New Process

The present invention relates to new processes for improving the manufacture of
clavams, e.g. clavulanic acid. The present invention also provides novel DNA sequences and
new microorganisms capable of producing increased amounts of clavulanic acid.

Microorganisms, in particular Streptomyces sp., produce a number of antibiotics
including clavulanic acid and other clavams, cephalosporins, polyketides, cephamycins,
tunicamycin, holomycin and penicillins. There is considerable interest in being able to
manipulate the absolute and relative amounts of these antibiotics produced by the
microorganism and accordingly there have been a large number of studies investigating the
metabolic and genetic mechanisms of the biosynthetic pathways (Demain, A.L. (1990)
"Biosynthesis and regulation of β-lactam antibiotics." in "50 years of Penicillin applications,
history and trends").

Streptomyces clavuligerus produces two major groups of antibiotics: one comprising the
cephamycins, cephalosporins and penicillins (Demain, A.L. (1990) supra) and the other
comprising clavams. Clavams can be arbitrarily divided into two groups, 5S and 5R clavams,
dependent on their ring stereochemistry. The commercially important clavam clavulanic
acid, a component of the antibiotic Augmentin (trade mark of GlaxoSmithKline), is a 5R
clavam. Examples of 5S clavams are clavam-2-carboxylate (C-2-C), 2-hydroxymethyl
clavam (2HMC) and alanylclavam (Brown et al (1979) J. Chem. Soc. Chem. pp282-283).

Genes encoding biosynthetic enzymes and regulatory proteins for clavulanic acid
production have been located in a cluster next to the genes involved in cephamycin C
production and make up a supercluster of antibiotic related genes within the S. clavuligerus
genome (Alexander et al (1998) J. Bacteriol. 180:4068-79). For example the genes encoding
the enzymes involved in the production of clavaminic acid, a clavulanic acid precursor,
including orf2 (ceaS) (Khaleeli et al (1999) J. Am. Chem. Soc. 121:9223-9224), orf3 (bls)
(Bachmann and Townsend (1998) Chem. Commun.:2325-2326), orf4 (pah) (Wu et al (1995)
J. Bacteriol. 177:3714-3720), orf5 (cas2) (Marsh et al (1992) Biochemistry. 31:12648-57)
and perhaps orf6 (Kershaw et al (2002) Eur. J. Biochem. 269,2052-2059), are all located
within the clavulanic acid cluster. Disruptions in orfs 2-6 cause a complete loss of clavulanic
acid production when mutant cultures are grown on starch asparagine medium (Aidoo, K.A.
et al (1993) p219-236 In: V.P. Gullo, J.C. Hunter-Cevera, R. Cooper and R. K. Johnson (ed.),
Developments in Industrial Microbiology series, vol.33 Society for Industrial Microbiology,
Fredericksburg, Va.). However, this loss is conditional upon the growth medium used, for when
the mutants are grown on Soy medium (Salowe et al (1990) Biochemistry 29: 6499-6508)
clavulanic acid production is partially restored (Jensen et al (2000) Antimicrob. Agents and
Chemother. 44: 720-726). This phenomenon could suggest that other genes present in the S.
clavuligerus genome could compensate in some way for the loss of the activity of these genes
under certain conditions. Alternatively it could be that the Soy medium contains very small
amounts of one or more of the metabolites produced by orfs 2-6, allowing strains disrupted in
these genes to make small amounts of clavulanic acid.

Marsh et al. (1992) supra have reported that S. clavuligerus contains two copies of
the cas gene (cas1 and cas2). cas1 is not associated with the clavulanic acid gene cluster and
has a high homology to cas2. Disruption of cas2 decreases clavulanic acid production by 35%
when cultures are grown on Soy medium and eliminates production entirely when cultures are
grown on starch asparagine (SA) medium (Paradkar and Jensen 1995 J.Bact 177: 1307-1314).
The disruption of the cas1 gene results in mutants which produce near wild type levels
of clavulanic acid on SA medium, but produce 31-73% less clavulanic acid when grown on
Soy medium than the wild type (Mosher et al (1999) Antimicrob. Agents and Chemother. 43:
1215-1224). It is also reported that in mutant strains where both the cas1 and cas2 genes have
been disrupted no clavulanic acid is produced under any of the fermentation conditions tested.

Interestingly, when the genes surrounding cas1 were sequenced, no additional genes involved
in clavulanic acid production were found but instead six novel genes involved in 5S clavam
biosynthesis (named cvm1 to 6) were identified (Mosher et al (1999) supra). Further work
on these 5S clavam-specific genes showed that disruption of the genes, using genetic
engineering methodologies, leads to improvements in the levels of clavulanic acid made by
the mutant strains and also dramatic reductions in the levels of 5S clavam production
(WO98/33896). This reduction in 5S clavam production, in particular the 5S clavam
clavam-2-carboxylate, is especially important in the commercial production of clavulanic acid
because some 5S clavams are known to be toxic and for this reason the levels are tightly
controlled within the British and US Pharmacopoeias.

Despite these advances in the understanding of clavulanic acid biosynthesis it is still a
highly desirable goal in the pharmaceutical industry to continue to improve production
methods for clavulanic acid, both for reasons of cost and for reasons of safety.

The following definitions are provided to facilitate understanding of certain terms 
used frequently herein: 

"Gene" as used herein also includes any regulatory region required for gene function
or expression.

"cvm" genes as used herein refers to any of the genes cvm1, cvm2, cvm3, cvm4, cvm5,
cvm6 or cvm7 as defined hereinabove.

"cvmpara" genes as used herein refers to any of the genes cvm6para or cvm7para as
defined hereinabove.

"orf" genes as used herein refers to any of the genes orf2, orf3, orf4, orf5, orf6, orf7, orf8,
orf9, orf10, orf11, orf12, orf13, orf14, orf15, orf16, orf17 or orf18 as defined hereinabove.

"orfpara" genes as used herein refers to any of the genes orf2para, orf3para, orf4para or
orf6para as defined hereinabove.

"Disrupted" as used herein means that the activity of the gene (with regard to 5S clavam
production) has been reduced or eliminated by, for example, insertional inactivation using an
antibiotic resistance gene, preferably an apramycin resistance gene (Paradkar, A.S and Jensen,
S.E (1995) supra), or another mutagenesis technique (for example those disclosed in Sambrook
et al (1989) supra). Other mutagenesis techniques include insertion of other DNAs (not
antibiotic resistance genes) and site-directed mutagenesis to either change one or more bases
in the gene sequence or insert one or more bases into the sequence of the gene.

"Deleted" as used herein means that the gene, or a segment thereof, has been deleted
(removed) from a larger polynucleotide which, before the deletion was performed, included said
gene or segment thereof. When the polynucleotide bearing the deletion is introduced into the
genome of the microorganism by means of gene replacement technology (Paradkar and Jensen
(1995) supra) the activity of the gene or protein encoded thereby is eliminated or reduced such
that the levels of 5S clavam produced by the microorganism are reduced. The deletion may be
large (for example the complete open reading frame with or without regulatory control regions) or
small (for example a single base pair resulting in a frameshift mutation).
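
The effect of a small deletion of this kind can be illustrated with a short sketch. The Python example below translates a made-up 18-base open reading frame before and after removal of a single base. The sequence and the helper names (`translate`, `delete_base`) are hypothetical and chosen only for illustration; they are not taken from any sequence of the invention. The standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(dna: str, index: int) -> str:
    """Return the sequence with the base at the given 0-based index removed."""
    return dna[:index] + dna[index + 1:]


# Hypothetical reading frame, for illustration only.
WILD_TYPE = "ATGGCCGAGCTGACCTAA"
MUTANT = delete_base(WILD_TYPE, 4)  # single-base deletion in the second codon

print(translate(WILD_TYPE))  # MAELT
print(translate(MUTANT))     # MAS
```

In this toy case the single-base deletion shifts the reading frame, changes the third residue and brings a premature stop codon into frame, so the encoded polypeptide is truncated.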

"Reduced" as used herein means that the levels of 5S clavam produced by the 
microorganism of the invention are lower than the levels produced in the corresponding S.
clavuligerus strain which has not had the relevant open reading frames disrupted or deleted. The
corresponding S. clavuligerus is therefore the "parent" strain into which the disrupted or deleted
open reading frames were subsequently introduced to generate the microorganism of the
invention.

"At least maintained" as used herein means that the level of clavulanic acid produced 
in the microorganism of the invention is the same as or greater than that produced in the
corresponding S. clavuligerus strain which has not had the relevant open reading frames
disrupted or deleted. The corresponding S. clavuligerus is therefore the "parent" strain into
which the disrupted or deleted open reading frames were subsequently introduced to generate
the microorganism of the invention.


The present invention concerns new processes for making clavulanic acid using
newly identified S. clavuligerus genes. Using a probe derived from orf4, a fragment of the S.
clavuligerus genome has been isolated and has been shown to comprise a number of genes
that, when disrupted, are shown to affect 5S and 5R clavam biosynthesis in S. clavuligerus.
Sequence analysis of the fragment has indicated the presence of a gene showing high
similarity to orf4 (hereinafter called orf4par). Surprisingly, however, further sequence
analysis of the regions flanking the orf4par gene has revealed a new cluster of genes
comprising paralogues of genes previously identified in both the clavulanic acid (cas2 cluster)
and 5S clavam (cas1 cluster) gene clusters.

Accordingly the invention provides a S. clavuligerus microorganism comprising DNA
corresponding to one or more open reading frames essential for 5S clavam biosynthesis, wherein
said open reading frames are disrupted or deleted such that the production of 5S clavams by said
S. clavuligerus is reduced and clavulanic acid production is at least maintained, wherein the open
reading frames are selected from:

a) cvm6para (SEQ ID NO:1);

b) cvm7para (SEQ ID NO:2);

c) cvm6para and cvm6 (SEQ ID NO:5); or

d) cvm7para and cvm7 (SEQ ID NO:6).

In a second aspect the invention provides a S. clavuligerus microorganism comprising
DNA corresponding to one or more open reading frames essential for 5S clavam biosynthesis,
wherein said open reading frames are disrupted or deleted such that the production of 5S clavams
by said S. clavuligerus is reduced and clavulanic acid production is at least maintained, wherein
the open reading frames are selected from:

a) cvm6para and one or more of cvm1 (SEQ ID NO:7), cvm2 (SEQ ID NO:8), cvm3 (SEQ ID
NO:9), cvm4 (SEQ ID NO:10), cvm5 (SEQ ID NO:11), cvm6, cvm7 or cvm7para; or

b) cvm7para and one or more of cvm1, cvm2, cvm3, cvm4, cvm5, cvm6, cvm7 or cvm6para.

The genes cvm1, cvm2, cvm3, cvm4, cvm5 and cvm6 are disclosed in Mosher et al (1999)
supra and WO98/33896 (cvm1 is orfup1, cvm2 is orfup2, cvm3 is orfup3, cvm4 is orfdwn1, cvm5 is
orfdwn2 and cvm6 is orfdwn3). The cvm7 gene, found to be a further 5S clavam specific gene of
the 5S clavam (cas1) cluster, has been identified during work leading to the present invention and
is disclosed hereinbelow.

In a further aspect the invention provides isolated polynucleotides comprising the
cvm6para and cvm7para open reading frames which are used in the preparation of the S.
clavuligerus microorganism of the invention. Preferably said polynucleotides comprise open
reading frames selected from the group consisting of:

a) cvm6para;

b) cvm7para;

c) cvm6para and cvm6;

d) cvm7para and cvm7;

e) cvm6para and one or more of cvm1, cvm2, cvm3, cvm4, cvm5, cvm6, cvm7 or cvm7para; or

f) cvm7para and one or more of cvm1, cvm2, cvm3, cvm4, cvm5, cvm6, cvm7 or cvm6para.

In another aspect the present invention provides vectors for cloning and manipulating the
cvm polynucleotides disclosed herein and which can be used in the preparation of the S.
clavuligerus microorganism of the invention. Processes for using these vectors to make the S.
clavuligerus microorganism of the invention are also provided.

The encoded polypeptides from cvm6para and cvm7para are also provided by the 
invention (SEQ ID NO:3 and SEQ ID NO:4 respectively). 
The invention further provides a polynucleotide comprising one or more open reading
frames encoding one or more enzymes involved in clavulanic acid biosynthesis wherein said open
reading frames are selected from the group consisting of:

a) orf2para (SEQ ID NO:12),

b) orf3para (SEQ ID NO:13),

c) orf4para (SEQ ID NO:14), and

d) orf6para (SEQ ID NO:15).

In a further aspect the invention provides a polynucleotide comprising one or more open
reading frames encoding one or more enzymes involved in clavulanic acid biosynthesis wherein
said open reading frames comprise one or more of:

a) orf2para,

b) orf3para,

c) orf4para,

d) orf6para

in combination with one or more genes involved in clavulanic acid biosynthesis selected from
orf2, orf3, orf4, orf5, orf6, orf7, orf8, orf9, orf10 (Canadian patent application CA2108113 and
Jensen, S.E et al (2000) Antimicrob. Agents Chemother 44:720-6), orf11, orf12 (Li, R.N et al
(2000) J. Bacteriol 182:4087-95), orf13, orf14, orf15, orf16, orf17 or orf18 (patent application
PCT/GB02/04989).

Vectors comprising such polynucleotides are also provided by the present invention
together with processes for the use of such vectors to prepare strains of Streptomyces clavuligerus
which can be used to produce elevated levels of clavulanic acid.

Strains of Streptomyces clavuligerus so produced and methods for using them to produce 
clavulanic acid by fermentation are also provided. 

Thus the invention further provides a Streptomyces clavuligerus microorganism
comprising a vector comprising a polynucleotide comprising one or more open reading frames
encoding one or more enzymes involved in clavulanic acid biosynthesis wherein said open reading
frames are selected from the group consisting of:

a) orf2para,

b) orf3para,

c) orf4para, and

d) orf6para.

In a further aspect the invention provides a Streptomyces clavuligerus microorganism
comprising a vector comprising a polynucleotide comprising one or more open reading frames
encoding one or more enzymes involved in clavulanic acid biosynthesis wherein said open reading
frames are selected from the group consisting of:

a) orf2para,

b) orf3para,

c) orf4para,

d) orf6para

in combination with one or more genes involved in clavulanic acid biosynthesis selected from
orf2, orf3, orf4, orf5, orf6, orf7, orf8, orf9, orf10 (Canadian patent application CA2108113 and
Jensen, S.E et al (2000) Antimicrob. Agents Chemother 44:720-6), orf11, orf12 (Li, R.N et al
(2000) J. Bacteriol 182:4087-95), orf13, orf14, orf15, orf16, orf17 or orf18 (patent application
PCT/GB02/04989).

The present invention also contemplates a S. clavuligerus microorganism comprising a
combination of one or more disrupted or deleted cvm6para or cvm7para genes, optionally in
combination with other disrupted or deleted 5S genes previously disclosed, together with vectors
comprising orf2para, orf3para, orf4para or orf6para genes, optionally in combination with other
clavulanic acid biosynthetic genes (selected from the genes orf2 to orf18) previously disclosed.

Polynucleotides of the invention can be isolated by conventional cloning methods, such as
PCR or library screening methods, using the sequences disclosed herein and in Mosher et al
(1999) supra, WO98/33896, Canadian patent application CA2108113, Jensen, S.E et al (2000)
supra, Li, R.N et al (2000) supra and patent application PCT/GB02/04989, as indicated
hereinabove. Examples of such cloning methods are described in, for example, Sambrook, J et al
(1989) Molecular cloning, a laboratory manual (2nd Ed) Cold Spring Harbor Laboratory, Cold
Spring Harbor, New York.

Polynucleotides comprising individual open reading frames can be isolated and ligated
together into vectors in a variety of combinations as defined hereinabove using techniques well
known in the art. The choice of vector will depend on the function being carried out, for example
cloning, expression, gene inactivation or transfer into S. clavuligerus, e.g. for gene replacement. In
all cases a variety of vectors are available to the skilled person and are well known in the art. For
example such vectors are known from Sambrook, J et al (1989) supra for general cloning vectors,
Hopwood, D.A et al (1985) supra for Streptomyces vectors, Paradkar and Jensen (1995) supra,
Mosher et al (1999) supra and WO98/33896 supra for gene disruption and gene replacement
vectors and CA2108113 supra for vectors suitable for expression of genes in Streptomyces
clavuligerus. However the choice of vector is not limited to just those disclosed in these sources.

Further, in the case of the gene combinations involving the orf2para, orf3para, orf4para
and orf6para genes the skilled artisan would be able to design suitable DNA constructs
to ensure that each open reading frame is suitably positioned relative to a transcriptional promoter,
whether this be the native promoter or a heterologous promoter that also functions in the
Streptomyces clavuligerus background, or indeed other regulatory sequence, in such a manner that
expression of each open reading frame is optimally achieved.

Subsequent manipulation of the polynucleotides, in particular with respect to their
introduction into the Streptomyces clavuligerus background, can be carried out according to
standard methods as disclosed in, for example, Hopwood, D.A et al (1985) supra. Disruption of
gene sequences, and subsequent gene replacement, can be carried out according to the method of
Paradkar, A.S and Jensen, S.E (1995) supra. Deletion of gene sequences can be carried out using
well established techniques, for example that disclosed in WO98/33896.

Microorganisms of the invention can be prepared from Streptomyces clavuligerus strains
including, but not limited to, Streptomyces clavuligerus ATCC 27064 (American Type Culture
Collection, Manassas, Virginia, USA), alternatively available as NRRL 3585 (Northern Regional
Research Laboratory, Peoria, Illinois, USA). Mutant strains of Streptomyces
clavuligerus can also be used, including those prepared by genetic engineering techniques or those
prepared by strain improvement methods. Examples of such strains include Streptomyces
clavuligerus strains 56-1A, 56-3A, 57-2B, 57-1C, 60-1A, 60-2A, 60-3A, 61-1A, 61-2A, 61-3A or
61-4A as disclosed in WO98/33896.

Thus in another aspect the invention relates to a process for improving clavulanic acid
production in a suitable microorganism comprising isolating a polynucleotide as described
hereinabove, manipulating said polynucleotide, introducing the manipulated polynucleotide into
said suitable microorganism, fermenting said suitable microorganism under conditions whereby
clavulanic acid is produced, and isolating and purifying the clavulanic acid so produced. Manipulation of
said polynucleotide may be by means of disrupting or deleting gene sequences in the case of
cvmpara genes, optionally together with cvm genes, or by inserting into vectors suitable for
expression in the case of orfpara genes, optionally together with orf genes.
Preferably the suitable microorganism is Streptomyces clavuligerus.
Such fermentation, isolation and purification methods are well known in the art, for
example the fermentation methods disclosed in UK Patent Specification No. 1,508,977. Methods
for using clavulanic acid in the preparation of antibiotic formulations are similarly well known in
the art.

Examples 

Example 1 - Materials and Methods 
In the examples all methods are as described in Sambrook, J. et al supra, Hopwood, D.A.
et al. (1985) supra and Kieser, T et al. (2000) Practical Streptomyces Genetics, unless
otherwise stated. Transformation methods can also be found in Paradkar, A.S. and Jensen, S.E.
(1995) supra.

1.1 Bacterial strains, media and culture conditions. 

Streptomyces clavuligerus NRRL 3585 was obtained from the Northern Regional
Research Laboratory (Peoria, IL). S. clavuligerus was maintained on either MYM agar
(Stuttard, C. (1982) J. Gen. Microbiol. 128:115-121) or ISP Medium #4 agar plates (Difco,
Detroit, MI).

Cultures for the isolation of chromosomal DNA were grown on a 2:3 mixture of
trypticase soy broth and YEME as described by Alexander et al. (1998) J.Bact. 180:4068-79.
Cultures for analysis of the production of clavulanic acid and other clavam metabolites were
grown on Soy medium (European Patent 0349 121) unless otherwise stated. All liquid
cultures were grown at 26°C on a rotary shaker at 250 rpm.

Manipulation of DNA in Escherichia coli was done using strain XL-1 Blue (Stratagene,
La Jolla, CA). E. coli cultures were maintained on LB agar medium and grown in liquid
culture in LB medium at 37°C (Sambrook, J et al (1989) supra). Plasmid-containing cultures
were supplemented with appropriate levels of antibiotic.

1.2 DNA manipulations. 

Standard DNA manipulations such as plasmid isolation, restriction endonuclease
digestion, generation of blunt-ended fragments, ligation, 32P labelling of DNA probes by nick
translation and E. coli transformation were carried out as described in Sambrook J et al (1989)
supra. Plasmid and genomic DNA isolation from Streptomyces spp. was conducted as
described in Kieser, T et al (2000) supra. Construction of a library of S. clavuligerus genomic
DNA fragments in the cosmid pWE15 was carried out according to the manufacturer's
instructions (Stratagene).

Southern analysis of S. clavuligerus DNA fragments was conducted at high stringency as 
described by Sambrook, J et al (1989) supra. Hybridization membranes were washed twice 
for 30 min at 2xSSC/0.1% SDS and once for 30 min at 0.1xSSC/0.1% SDS, all at 65°C. 


Example 2 - Preparation of the paralogue cluster DNA fragment 

2.1 Cloning and nucleotide sequencing of the orf4 paralogue

A strong and a very weak hybridization signal were consistently observed on Southern
blots of NcoI-digested S. clavuligerus chromosomal DNA when probed with the orf4 gene
(CA2108113). The strong signal corresponded to the orf4 gene, but the identity of the gene
that gave rise to the very weak signal was unknown. Therefore it was decided to clone this
gene. To this end, NcoI fragments from S. clavuligerus DNA of approximately 4-5 kb in size
were ligated into NcoI-digested pUC120 (Vieira, J and J Messing (1987) Methods Enzymol.
153, 3-11) and screened using a colony blot hybridisation method employing the orf4
gene as a probe. Plasmid DNA was isolated from potential positive clones and confirmed to
carry a 4.3 kb NcoI fragment. A representative clone, p04H-4, was chosen for further study.
The 4.3 kb NcoI fragment was sequenced. Analysis of the sequence
generated identified three genes, one of which had homology to orf4 and was called orf4par.
The two other genes present were found to have homology with orf6 and cvm6 and were
therefore called orf6par and cvm6par. This result suggested that this region of DNA may
contain a cluster of genes with paralogues in either the clavulanic acid biosynthetic gene
cluster or the cvm clavam biosynthetic gene cluster.

2.2 Sequencing of DNA flanking the 4.3 kb NcoI fragment containing orf4par

Sequence analysis of DNA flanking the 4.3 kb NcoI fragment containing orf4par was
achieved by identifying 2 cosmid clones containing the orf4par gene. The two cosmid clones
containing orf4par, 14E10 and 6G9, were isolated from a S. clavuligerus pWE15 (Promega,
Madison, WI) cosmid bank that had been probed with a 0.46 kb SalI fragment that is internal
to the orf4par gene. These cosmids have been partially mapped using a series of digestions
and Southern hybridization experiments (In: Nucleic acid techniques in bacterial systematics.
Ed. Stackebrandt, E and Goodfellow, M (1991) John Wiley and Sons, p205-248). Digestion
of both cosmids with EcoRI, KpnI and NruI suggests that the insert size of 14E10 is
approximately 45 kb and that of 6G9 is approximately 40 kb. These two cosmid inserts have about
20 kb of overlapping DNA and provided DNA for sequence analysis of regions upstream and
downstream of the 4.3 kb NcoI fragment containing orf4par.
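
As a rough check on how much chromosomal DNA the two cosmids cover together, the approximate sizes above can be combined by counting the overlap once. The sketch below is only illustrative arithmetic on the approximate figures given in the text; the function name `combined_span_kb` is hypothetical.

```python
def combined_span_kb(insert_a_kb: float, insert_b_kb: float, overlap_kb: float) -> float:
    """Region covered by two overlapping inserts, counting the overlap once."""
    if overlap_kb < 0 or overlap_kb > min(insert_a_kb, insert_b_kb):
        raise ValueError("overlap must lie between 0 and the smaller insert size")
    return insert_a_kb + insert_b_kb - overlap_kb


# Approximate values from the restriction mapping of cosmids 14E10 and 6G9.
print(combined_span_kb(45, 40, 20))  # 65
```

On these estimates the two inserts together span roughly 65 kb of the S. clavuligerus chromosome around orf4par.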

DNA sequence information was generated essentially as described in CA2108113. The
DYEnamic ET Terminator Cycle Sequencing Kit (Amersham Pharmacia, Baie d'Urfe,
Quebec, Canada) was used. Approximately 13.3 kilobases of contiguous DNA sequence was
generated. The nucleotide sequence of the S. clavuligerus chromosomal DNA generated in
these experiments is shown in SEQ ID No: 16.

A number of open reading frames were identified which displayed significant homology
with the previously described orf2, orf3, orf4 and orf6 (CA2108113). These genes have
been located within the genome in relation to each other, and are found to be in nearly the
same organisation as that of the genes within the clavulanic acid cluster. The genes orf2par,
orf3par and orf4par are adjacent to each other and in the same orientation as their
counterparts orf2, orf3 and orf4. However cas1 is not downstream of orf4par, as cas2 is of
orf4 in the clavulanic acid pathway, but is instead within the clavam cluster (Mosher et al
(1999) supra). Another difference between the clavulanic acid cluster and the paralogue
arrangement is that orf6par is end-on-end to orf4par, and so is not in the same orientation as
orf2par-4par, whereas orf6 is in the same orientation as orfs 2-4 in the clavulanic acid cluster.
Surprisingly, the gene immediately upstream of orf6par was found to be a gene that had a
paralogue in the clavam cluster and not in the clavulanic acid cluster. This gene was called cvm6par, as
it is a paralogue of the cvm6 gene found clustered with cas1 (Mosher et al (1999) supra). The
cvm6 gene encodes an enzyme that is involved in clavam production (orfdwn3 in
WO98/33896).

Located adjacent to cvm6par is a new gene called cvm7par. This gene shows homology
to cvm7, a gene that is located upstream of cvm3 in the clavam cluster (further described
hereinbelow). Upstream of cvm7 is a new open reading frame, believed to encode a sensor
kinase. It encodes a polypeptide of 555 amino acids and shows good similarity to sensor
kinase domains of two component response regulator genes.

2.3 Functional analysis of the open reading frames 
Computer analysis of the DNA sequence shown in SEQ ID No. 16 predicts the presence
of 7 open reading frames. A description of each gene is shown in Table 1.



Table 1

Orf Designation    Homology (blast P)

orf2par            acetolactate synthase
                   (67% identity to orf2 carboxyethyl arginine synthase, CEAS)

orf3par            asparagine synthetase
                   (49% identity with orf3 β-lactam synthase, BLS)

orf4par            amidinohydrolase
                   (71% identity with orf4 amidinohydrolase, PAH)

orf6par            ornithine acetyltransferase
                   (47% identity with orf6 ornithine acetyltransferase, OAT)

cvm6par            aminotransferase
                   (66% identity with cvm6 acetylornithine aminotransferase)

cvm7par            Transcriptional regulator

Sensor Kinase      Sensor Kinase
                   (47% identity with 2 component system from S. coelicolor A3(2))
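
The identity figures in Table 1 are reported from blast P. As a hedged illustration of what a percent identity value expresses, the Python sketch below counts identical residues over the columns of an already aligned pair of sequences. It is a simplification and not the BLAST algorithm itself: the function name `percent_identity`, the gap character and the two short peptide strings are hypothetical, and the real program computes identity over the local alignment it finds.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical aligned peptides, for illustration only: 4 identical columns out of 6.
print(round(percent_identity("MKV-LA", "MKVTLG"), 1))  # 66.7
```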



To assess the possible roles of these ORFs in the biosynthesis of clavulanic acid and/or 
clavams produced by S. clavuligerus, insertional inactivation mutants were created by gene 
replacement essentially as described by Paradkar and Jensen (1995) supra. 

However, in order to define the phenotype of these disruptions conclusively, it was considered
important to disrupt orf3par, orf4par, orf6par and cvm6par not only in wild type S.
clavuligerus, but also in strains of S. clavuligerus that were already defective in the
expression of orf3, orf4, orf6 and cvm6 respectively. The orf3, orf4 and orf6 mutants were made as
described in United States Patent No. 6,332,106 and the cvm6 mutant was made as described in
WO98/33896.



Example 3 - Analysis of orf4 and orf4par
3.1 Construction of orf4 mutants

Mutants disrupted in orf4 (pah) were made as described in United States Patent No.
6,332,106.



3.2 Construction of orf4par mutants 

p04H-4 (the 4.3 kb NcoI fragment cloned into the NcoI site of pUC120 (Vieira and Messing
(1987) supra)) was digested with KpnI (one site in the cloned fragment and one site in the
vector) and religated to reduce the size of the orf4par-bearing DNA insert to 1.7 kb, thereby
generating the plasmid p4K-1. The orf4par gene within p4K-1 was disrupted by digestion at
its centrally located EcoNI site and insertion of the apramycin (apr) resistance gene cassette
from pUC120apr (Trepanier et al. (2002) Microbiology 148: 643-656) after both fragments
had been made blunt by treatment with the Klenow fragment of DNA polymerase I. The
KpnI/NcoI insert carrying the disrupted orf4par gene was then inserted into the EcoRI site of
pDA501 after blunting the ends of both insert and vector. pDA501 is a shuttle vector prepared
by fusing the Streptomyces plasmid pIJ486 (Kieser, T et al (2000) supra) to the E. coli
plasmid pTZ18R (Stratagene) by means of their EcoRI and BamHI sites. The resulting
construct, 6pDAB, was used to transform S. lividans TK24, and finally wild-type S.
clavuligerus, to thiostrepton (thio at 5 µg/ml) and apramycin (apr at 20 µg/ml) resistance.
Gene replacement mutants were generated as described by Paradkar and Jensen (1995)
supra.

3.3 Construction of orf4/orf4par mutants 

An approach was undertaken to generate the double mutant by transforming protoplasts
of the orf4par (aprR) mutant with the orf4 (thioR) disruption construct (Aidoo et al (1994)
Gene. 147:41-6). Protoplast preparations from orf4par mutants were transformed with the
orf4 disruption construct isolated from S. lividans. Transformants were selected on
thiostrepton at 5 µg/ml and hygromycin (hyg) at 50 µg/ml. Primary transformants were put
through two rounds of sporulation under non-selective conditions, in order to generate gene
replacement mutants as described by Paradkar and Jensen (1995) supra.

3.4 Fermentation analysis of orf4, orf4par and orf4/orf4par mutants

To test the effect of disrupting orf4, orf4par and orf4/orf4par on clavulanic acid
biosynthesis, spores from each isolate were inoculated into 20 ml of seed medium (European
patent 0 349 121) and grown for 2 days at 26°C with shaking. 1 ml of the seed culture was
then inoculated into a final stage Soy medium (European Patent 0349 121) and grown at 26°C
for up to 3 days with shaking. Samples of final stage broth were withdrawn after three days
growth and assayed for clavulanic acid productivity by HPLC (Mosher et al (1999) supra)
and/or using an imidazole derivatised colorimetric assay (Bird, A.E. et al (1982) Analyst,
107: 1241-1245 and Foulston, M. and Reading, C. (1982) Antimicrob. Agents Chemother.,
22:753-762).

Fermentation analysis of orf4 disruptant 

The orf4 disruptant was fermented in Soy medium and compared to wild type S.
clavuligerus for production of clavulanic acid. After 72 hrs growth, accumulation of
clavulanic acid was reduced by 71%.

From these results it can be concluded that orf4 is required for efficient production of 
clavulanic acid as elimination of this gene by disruption causes a reduction in clavulanic acid 
levels. 

Fermentation analysis of orf4par disruptant

Mutant 5pDA defective in the orf4par gene was fermented in Soy medium and
compared to wild type S. clavuligerus for production of clavulanic acid. After 72hrs growth,
accumulation of clavulanic acid was reduced by 12%.

From these results it can be concluded that, like orf4, orf4par contributes to
clavulanic acid biosynthesis as elimination of this gene by disruption causes a reduction in
clavulanic acid levels.

Fermentation analysis of orf4/orf4par disruptants

When mutants A4-A1 and 3A3-A3, defective in both copies of the orf4 genes, were
grown in Soy medium, production of clavulanic acid could not be detected.

From these results it can be concluded that under the conditions tested, both genes,
orf4 and orf4par, contribute to clavulanic acid biosynthesis as the double disruption results in
a mutant unable to make clavulanic acid.

3.5 Southern Analysis 

The orf4, orf4par and orf4/orf4par mutants were further characterised by Southern
analysis. The results confirmed that in these mutants the chromosomal copies of the relevant
genes had been disrupted as expected.

Example 4 - Analysis of orf6 and orf6par

4.1 Construction of orf6 mutants 

orf6 mutants were made as described in United States Patent No. 6,332,106.

4.2 Construction of orf6par mutants

The orf6par gene was disrupted by introduction of a neomycin resistance gene (neoR) into
the RsrII site, approximately midway through the coding region. In order to achieve this
p04H-4 was digested with KpnI to remove orf4par and self ligated to give p5K-6. p5K-6 was
digested with RsrII and the neomycin resistance gene, released from pFDNeo-S (Denis and
Brzezinski (1992) Gene 111:115-118) as a PstI/EcoRI fragment, was inserted after both
fragments had been made blunt by treatment with the Klenow fragment of DNA polymerase
I. The construct pNeo5K-6A was obtained which has the neoR gene in the same orientation as
the orf6par gene.

A shuttle vector called pNeo5K-6Atsi# 14 was constructed by inserting pIJ486 as a
6.2 Kb fragment linearised with BglII, into the BamHI polylinker site of pNeo5K-6A. The
shuttle vector was used to transform S. lividans TK24 and finally S. clavuligerus WT to
thiostrepton (5µg/ml) and neomycin (50µg/ml) resistance. Primary transformants were
subjected to two rounds of sporulation under non-selective conditions in order to generate
gene replacement mutants as described by Paradkar and Jensen (1995) supra.

4.3 Construction of orf6/orf6par mutants 

orf6/orf6par double mutants were generated by transforming protoplasts of the orf6par
(neoR) mutant with the orf6 (aprR) disruption construct (Mosher et al (1999) supra). Protoplast
preparations from orf6par mutants were transformed with the orf6 disruption construct
isolated from S. lividans. Transformants were selected on apramycin (apr) at 50µg/ml. Primary
transformants were put through two rounds of sporulation under non-selective conditions in
order to generate gene replacement mutants as described by Paradkar and Jensen (1995)
supra.

4.4 Fermentation of orf6, orf6par and orf6/orf6par mutants

To test the effect of disrupting orf6, orf6par and orf6/orf6par on clavulanic acid
biosynthesis, spores from each isolate were tested as previously described in section 3.4.

Fermentation Analysis of orf6 mutants

Mutant 6-1A defective in the orf6 gene was fermented in Soy medium and compared
to wild type S. clavuligerus for production of clavulanic acid. After 72hrs growth,
accumulation of clavulanic acid was reduced by 57%. From these results it can be concluded
that orf6 is required for efficient production of clavulanic acid as elimination of this gene by
disruption causes a reduction in clavulanic acid levels.

Fermentation Analysis of orf6par mutants

Mutant 14-2B(2) defective in the orf6par gene was fermented in Soy medium and
compared to wild type S. clavuligerus for production of clavulanic acid. After 72hrs growth,
accumulation of clavulanic acid was reduced by 27%. From these results it can be concluded
that, like orf6, orf6par contributes to clavulanic acid biosynthesis as elimination of this gene
by disruption causes a reduction in clavulanic acid levels.

Fermentation Analysis of orf6/orf6par mutants

Two separate mutants defective in both orf6 and orf6par were fermented in Soy
medium and compared to wild type S. clavuligerus for production of clavulanic acid. After
72hrs growth, accumulation of clavulanic acid was reduced by an average of 65%.

From these results it can be concluded that both orf6 and orf6par are necessary for
efficient production of clavulanic acid since disruption of either copy of the gene causes a
reduction in clavulanic acid production. Inactivation of both copies of the gene caused a
further decrease, but not a complete loss of clavulanic acid producing ability.

4.5 Southern Analysis 

The orf6, orf6par and orf6/orf6par mutants were further characterised by Southern
analysis. The results confirmed that in these mutants the chromosomal copy of the relevant
gene had been disrupted as expected.

Example 5 - Analysis of cvm6 and cvm6par

5.1 Construction of cvm6 mutants

Construction of mutants disrupted in cvm6 has already been described in
WO98/33896 (cvm6 is orfdwn3).

5.2 Construction of cvm6par mutants

A 1.7 Kb SalI fragment containing cvm6par was released from p04H-4 and ligated
into pUC118 at the SalI site. The resulting plasmid was digested with EcoNI to release a 140
bp fragment internal to cvm6par. In place of this fragment, the neomycin resistance gene from
pFDNeo-S, released as an EcoRI/PstI fragment, was ligated into cvm6par after both
fragments had been made blunt by treatment with the Klenow fragment of DNA polymerase
I. The neoR marker was inserted in the same orientation as cvm6par. The neomycin containing
SalI fragment was released with EcoRI and inserted into the shuttle vector pUWL-KS
(Wehmeier, U.F. (1995) Gene 165: 149-150) at the EcoRI site. The construct was named
pNeoSal1.7U.

The plasmid pNeoSal1.7U was used to transform S. lividans TK24, and finally
S. clavuligerus wild type. The resulting cvm6par::neo transformants were selected on MYM
medium with 50µg/ml neomycin and 5µg/ml thiostrepton and then subjected to two rounds of
sporulation under non-selective conditions to give double cross-over mutants.

5.3 Construction of cvm6/cvm6par mutants 

The construct pNeoSal1.7U isolated from S. lividans TK24 was also used to transform
the cvm6 mutant 56-3A, where the aprR cassette was inserted into cvm6 in the same
orientation as the gene. Transformants were grown on MYM medium with 50µg/ml
neomycin and 5µg/ml thiostrepton. The mutants were put through two rounds of sporulation
under non-selective conditions as described above and double cross-over mutants were
isolated.

5.4 Fermentation of cvm6, cvm6par and cvm6/cvm6par mutants

To test the effect of disrupting cvm6, cvm6par and cvm6/cvm6par on β-lactam
biosynthesis, spores from each isolate were tested as previously described in section 3.4.

Fermentation Analysis of cvm6 mutants

It was reported in WO98/33896 that mutants 56-1A, 56-3A, 57-1C and 57-2B
defective in the cvm6 gene produced elevated levels of clavulanic acid (125-141% of the
control strain) and greatly reduced levels of clavam-2-carboxylate and
2-hydroxymethylclavam when cultured in Soy medium.

These results suggest that the cvm6 gene is required for efficient production of the 5S
clavams. Disruption of cvm6 not only results in a reduction in clavams but also a
simultaneous increase in clavulanic acid.

Fermentation Analysis of cvm6par mutants

Mutants 3A1, 3A2, 2A-6, 2B-1 and 2B-2 defective in the cvm6par gene were fermented in
Soy medium and compared to wild type S. clavuligerus for production of β-lactam
metabolites. After 72hrs growth, accumulations in clavulanic acid were increased by 6-11%.
Production of clavam-2-carboxylate and alanyl clavam was abolished and levels of
2-hydroxymethyl clavam reduced by 50-85%.

These results suggest that like cvm6 the cvm6par gene is required for efficient
production of the 5S clavams. Disruption of cvm6par not only results in a reduction in
clavams but also a simultaneous increase in clavulanic acid.

Fermentation Analysis of cvm6/cvm6par double mutants

Mutants A-1, A-2, B-1, B-2, C-1 and C-2 defective in both the cvm6 and cvm6par
genes were grown in Soy medium and compared to wild type S. clavuligerus for their
production of β-lactam metabolites. Production of clavulanic acid was increased by 12-27%,
production of alanyl clavam and clavam-2-carboxylate eliminated and levels of
2-hydroxymethyl clavam reduced by 70-83%.

These results indicate that, like the cvm6 and cvm6par single mutants, the cvm6/cvm6par
double mutants produced elevated levels of clavulanic acid and both genes are required for
the efficient production of 5S clavams.
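
The cvm6 result is quoted as a percentage of the control strain (125-141%), whereas the cvm6par and double-mutant results are quoted as increases over wild type (6-11% and 12-27%). A minimal Python sketch of putting the three ranges on one scale is given below. The dictionary and function names are illustrative only, and the comparison assumes that the control strain of WO98/33896 is equivalent to the wild type used here, which the text does not state.

```python
# Clavulanic acid levels as reported in section 5.4.
# "of_control": range is a percentage of the control strain titre.
# "increase":   range is a percentage increase over wild type.
REPORTED = {
    "cvm6": ("of_control", (125.0, 141.0)),
    "cvm6par": ("increase", (6.0, 11.0)),
    "cvm6/cvm6par": ("increase", (12.0, 27.0)),
}


def as_percent_increase(kind, value_range):
    """Express a reported (low, high) range as a percentage increase over the reference."""
    low, high = value_range
    if kind == "of_control":
        # 125% of control is a 25% increase over control.
        return (low - 100.0, high - 100.0)
    if kind == "increase":
        return (low, high)
    raise ValueError(f"unknown reporting convention: {kind}")


INCREASES = {
    mutant: as_percent_increase(kind, value_range)
    for mutant, (kind, value_range) in REPORTED.items()
}

for mutant, (low, high) in INCREASES.items():
    print(f"{mutant}: +{low:.0f}% to +{high:.0f}% clavulanic acid")
```

On this common scale the three ranges can be read side by side; the different strains and experiments mean the comparison is indicative only.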



5.5 Southern Analysis 

The cvm6, cvm6par and cvm6/cvm6par mutants were further characterised by Southern
analysis. The results confirmed that in these mutants the chromosomal copies of the relevant
genes had been disrupted as expected.

Example 6 - Analysis of orf3 and orf3par

6.1 Construction of orf3 mutants

Mutants disrupted in orf3 were made as described in United States Patent No. 6,332,106.



6.2 Construction of orf3par mutants

The plasmid p5.7EcoRI ref (pJOE based hyg) was used as the disruption template for
orf3par. The insert in this plasmid is approximately 5.7kb and includes part of cvm6par, all of
orf6par, orf4par, orf3par and part of orf2par all carried within the plasmid pJOE829 (Kieser,
T et al. (2000); Aidoo et al (1994) Gene. 147:41-6). The disruption vector was constructed by
ligation of a thiostrepton resistance cassette (Aidoo et al. supra) into FseI digested
p5.7EcoRI. A unique FseI site is located within the insert 507 bp from the start of orf3par.
The correct construct was obtained and used to sequentially transform S. lividans TK24 and
then S. clavuligerus wild type. Primary transformants were selected on thiostrepton (5µg/ml)
and hygromycin (25µg/ml). The mutants were put through two rounds of sporulation under
non-selective conditions as described above and putative double cross-over mutants were
isolated.

6.3 Construction of orf3/orf3par mutants

The orf3par disruption cassette described in section 6.2 was isolated from S. lividans
TK24 and used to transform orf3::apra mutants. Transformants were selected on MYM
medium containing thiostrepton (5µg/ml) and hygromycin (25µg/ml). The mutants were put
through two rounds of sporulation without selection and double crossover mutants isolated as
previously described.

6.4 Fermentation Analysis of orf3, orf3par and orf3/orf3par mutants

To test the effect of disrupting orf3, orf3par and orf3/orf3par on clavulanic acid
biosynthesis, spores from each isolate were tested as previously described in section 3.4.

Fermentation Analysis of orf3 mutants

Mutants Ap3-1, Ap3-2 and Ap3-3 were fermented in Soy medium and compared to
wild type S. clavuligerus for production of clavulanic acid. After 72hrs growth, accumulations
in clavulanic acid were reduced by 31-71%.

From these results it can be concluded that orf3 is required for efficient production of
clavulanic acid as elimination of this gene by disruption causes a reduction in clavulanic acid
levels.

Fermentation of orf3par mutants 

Mutants 3A-1 and 3A-2 were fermented in Soy medium and compared to wild type S.
clavuligerus for production of clavulanic acid. After 72hrs growth, accumulations in
clavulanic acid were reduced by 9%.

From these results it can be concluded that orf3par is required for efficient production 
of clavulanic acid as elimination of this gene by disruption causes a reduction in clavulanic 
acid levels. 

Fermentation of orf3/orf3par mutants

Clavulanic acid biosynthesis was completely abolished when mutants 11-1, 11-2, 2-1 and 2-2
defective in both copies of the orf3 gene were grown in Soy medium and compared to wild
type S. clavuligerus.

These results demonstrate that under the conditions tested, both genes, orf3 and
orf3par, contribute to clavulanic acid biosynthesis as the double disruption results in a mutant
unable to make any clavulanic acid.

6.5 Southern Analysis

The orf3, orf3par and orf3/orf3par mutants were further characterised by Southern
analysis. The results confirmed that in these mutants the chromosomal copies of the relevant
genes had been disrupted as expected.



Example 7 - Analysis of orf2 and orf2par

7.1 Construction of orf2 mutants

Mutants disrupted in orf2 were originally made as described in United States Patent
No. 6,332,106. These original orf2 mutants were subjected to a second round of gene
replacement to remove the apramycin resistance gene and replace it with a simple frameshift
mutation. The plasmid construct used to create the original orf2 mutant consisted of a 2.1 kb
EcoRI/BglII fragment of S. clavuligerus DNA carried on a pUC119/pIJ486 shuttle vector,
with the orf2 gene disrupted by insertion of an apramycin resistance gene cassette into a
centrally located NotI site (United States Patent No. 6,332,106). The disruption plasmid
construct used in the second round of mutation was derived from the original disruption
plasmid by digestion with NotI to release the apramycin resistance gene cassette, treatment
with the Klenow fragment of DNA polymerase I to fill in the overhanging ends, and then
re-ligation to circularize the plasmid. The resulting plasmid construct carries the entire orf2 gene
but with a frameshift introduced at the location of the destroyed NotI site. The construct was
used to sequentially transform S. lividans TK24 and then the original S. clavuligerus orf2
mutant. Primary transformants were selected on thiostrepton (5µg/ml) and then subjected to
two rounds of sporulation under non-selective conditions. Putative double cross-over mutants
were identified based on their loss of apramycin resistance.
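
The frameshift in section 7.1 follows from the geometry of the NotI cut. The Python sketch below is illustrative only: it assumes the standard NotI recognition sequence GCGGCCGC, cut after the second base on each strand to leave 4-base 5' overhangs (a detail not given in the text), and the helper name is hypothetical.

```python
def fill_in_and_religate(site: str, cut: int) -> str:
    """Top-strand sequence left at a palindromic restriction site after
    digestion, fill-in of the 5' overhangs and blunt-end re-ligation.

    site: recognition sequence, top strand 5'->3'.
    cut:  number of bases 5' of the top-strand cut (must give a 5' overhang).
    """
    n = len(site)
    if not 0 < cut < n / 2:
        raise ValueError("cut position must leave a 5' overhang")
    # Left fragment: the top strand ends at the cut, but the bottom strand
    # extends to position n - cut, so fill-in extends the top strand that far.
    left_filled = site[: n - cut]
    # Right fragment: the top strand already carries the overhang.
    right = site[cut:]
    return left_filled + right


NOTI_SITE = "GCGGCCGC"  # assumed recognition sequence
NOTI_CUT = 2            # assumed cut position: GC^GGCCGC

product = fill_in_and_religate(NOTI_SITE, NOTI_CUT)
added = len(product) - len(NOTI_SITE)

print(product)                # junction sequence after re-ligation
print(added, added % 3 != 0)  # bases added, and whether the frame shifts
print(NOTI_SITE in product)   # whether a NotI site survives
```

Under these assumptions the junction gains four bases, which is not a multiple of three, so the reading frame downstream of the site is shifted and the original recognition sequence is no longer present.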



7.2 Construction of orf2par mutants

orf2par mutants were generated using a PCR-based targeting kit known as
REDIRECT (trade mark of Plant Bioscience Limited, Norwich, U.K.). The plasmids pIJ790
and pIJ773, and the host strain E. coli BW25113 were supplied as part of the kit. For this
particular application, a pair of oligonucleotide primers,

KTA14: 5'-CCATCCCGGCGCCCGTCCGATGCGAAGGAGATCTCCATGATTCCGGGGATCCGTCGACC-3' and
KTA15: 5'-CGGGGCCGGGCATGGTGAACTCGTCCTCCACGGTGGTCATGTAGGCTGGAGCTGCTT-3',

designed to disrupt the orf2par gene by insertion of an apramycin
resistance gene, were synthesized. The orf2par disruption cassette was generated by PCR
using these two primers with the plasmid pIJ773 as template. PCR conditions used were as
described in the user instructions except that no dimethylsulfoxide was used. The orf2par
disruption cassette was then introduced by electrotransformation into E. coli
BW25113/pIJ790 which had been previously transformed with the orf2par-bearing cosmid
14E10 (described hereinabove). Cosmid DNA was isolated from transformants after
overnight growth at 37°C to promote loss of the pIJ790 plasmid and analyzed to confirm that
the orf2par gene had been disrupted. orf2par disrupted cosmid DNA was then transferred into
wild type S. clavuligerus by conjugation. Conjugation was carried out as described by Kieser,
T et al (2000) supra except that AS-1 medium (Baltz, R. H. Genetic recombination by
protoplast fusion in Streptomyces. Dev. Ind. Microbiol 21 (1980) 43-54) supplemented with
apramycin at 50µg/ml was used for recovery of transconjugants. Apramycin resistant S.
clavuligerus transconjugants were subjected to one round of sporulation under non-selective
conditions in order to generate gene replacement mutants as described by Paradkar and Jensen
(1995) supra.
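
As a rough check on the two targeting primers given in section 7.2, the Python sketch below splits each one into a gene-specific homology arm and a cassette-priming 3' end. The split points are an assumption: the 3' ends used here are simply the trailing portions of the primers as printed, taken to be the parts that anneal to the pIJ773 template. The text itself does not mark where the homology arm ends.

```python
# Primer sequences (5'->3') as printed in section 7.2, with extraction
# spaces and line-break hyphens removed.
PRIMERS = {
    "KTA14": "CCATCCCGGCGCCCGTCCGATGCGAAGGAGATCTCCATGATTCCGGGGATCCGTCGACC",
    "KTA15": "CGGGGCCGGGCATGGTGAACTCGTCCTCCACGGTGGTCATGTAGGCTGGAGCTGCTT",
}

# ASSUMPTION: these 3' ends are the template-priming portions of each primer.
PRIMING_ENDS = {
    "KTA14": "ATTCCGGGGATCCGTCGACC",
    "KTA15": "TGTAGGCTGGAGCTGCTT",
}


def split_primer(primer: str, priming_end: str):
    """Return (homology_arm, priming_end) for a primer ending in priming_end."""
    if not primer.endswith(priming_end):
        raise ValueError("primer does not end with the assumed priming sequence")
    return primer[: -len(priming_end)], priming_end


ARMS = {
    name: split_primer(seq, PRIMING_ENDS[name])[0]
    for name, seq in PRIMERS.items()
}

for name, arm in ARMS.items():
    print(f"{name}: {len(arm)}-nt arm, {len(PRIMING_ENDS[name])}-nt priming end")
```

With this split both arms come out at the same length, the KTA14 arm ends in ATG and the KTA15 arm ends in TCA (the reverse complement of TGA), which is what one would expect if the arms were designed to flank the orf2par coding region from start codon to stop codon.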

7.3 Construction of orf2/orf2par mutants 

The PCR-based targeting procedure used to generate the orf2par mutants (section 7.2)
was also used to generate orf2/orf2par double mutants. In this case the orf2par disrupted
cosmid DNA was conjugated into the orf2 mutants described above (section 7.1) rather than
into the wild type strain. Apramycin resistant S. clavuligerus transconjugants were subjected
to one round of sporulation under non-selective conditions in order to obtain unigenomic
mutant spores that had undergone gene replacement as previously described.

7.4 Fermentation analysis of orf2, orf2par and orf2/orf2par mutants

To test the effect of disrupting orf2, orf2par and orf2/orf2par on clavulanic acid
biosynthesis, spores from each isolate were tested as previously described in section 3.4.

Fermentation analysis of orf2 mutants

Mutants defective in the orf2 gene were fermented in Soy medium and compared to
wild type S. clavuligerus for production of clavulanic acid. After 72hrs growth, accumulations
in clavulanic acid were reduced by 95-98% (Jensen et al (2000) supra).

From these results it can be concluded that orf2 is required for efficient production of
clavulanic acid as elimination of this gene by disruption causes a severe reduction in
clavulanic acid production.

Fermentation analysis of orf2par disruptant

Mutants defective in the orf2par gene were fermented in Soy medium and compared
to wild type S. clavuligerus for production of clavulanic acid. After 72hrs growth,
accumulation of clavulanic acid was reduced by 10-30%.

From these results it can be concluded that, like orf2, orf2par contributes to
clavulanic acid biosynthesis as elimination of this gene by disruption causes a reduction in
clavulanic acid levels.

Fermentation analysis of orf2/orf2par disruptants

Mutants defective in both orf2 and orf2par were fermented in Soy medium and compared to
wild type S. clavuligerus for production of clavulanic acid. After 72hrs growth, no clavulanic
acid production could be detected from the strains containing the orf2 and orf2par mutations.
These results demonstrate that under the conditions tested, both genes, orf2 and orf2par,
contribute to clavulanic acid biosynthesis as the double disruption results in a mutant unable
to make clavulanic acid.

7.5 Southern Analysis

The orf2, orf2par and orf2/orf2par mutants were further characterised by Southern analysis.
The results confirmed that in these mutants the chromosomal copies of the relevant genes had
been disrupted as expected.
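
The fermentation results for the orf genes in Examples 3, 4, 6 and 7 are all quoted as percentage reductions relative to wild type. The Python sketch below, offered only as an illustration, collects those figures and converts them to residual titre as a percentage of wild type. The data structure and function names are hypothetical, and treating "could not be detected" or "abolished" as zero is a simplifying assumption, since the detection limit of the assays is not given.

```python
# Reported reductions in clavulanic acid (% relative to wild type), taken
# from Examples 3, 4, 6 and 7. Ranges are (low, high); None marks a result
# reported as not detected or completely abolished.
REPORTED_REDUCTION = {
    "orf2": {"single": (95, 98), "paralogue": (10, 30), "double": None},
    "orf3": {"single": (31, 71), "paralogue": (9, 9), "double": None},
    "orf4": {"single": (71, 71), "paralogue": (12, 12), "double": None},
    "orf6": {"single": (57, 57), "paralogue": (27, 27), "double": (65, 65)},
}


def residual_titre(reduction):
    """Residual titre range as % of wild type for a (low, high) % reduction.

    None (not detected) is treated as zero residual titre, which is an
    assumption made for illustration.
    """
    if reduction is None:
        return (0.0, 0.0)
    low, high = reduction
    return (100.0 - high, 100.0 - low)


for gene, results in REPORTED_REDUCTION.items():
    parts = []
    for mutant in ("single", "paralogue", "double"):
        low, high = residual_titre(results[mutant])
        parts.append(f"{mutant} {low:.0f}-{high:.0f}%")
    print(f"{gene}: " + ", ".join(parts))
```

Laid out this way, the pattern described in the text is easy to see: for orf2, orf3 and orf4 only the double disruption removes detectable clavulanic acid, whereas the orf6/orf6par double mutant retains part of the wild-type titre.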

Example 8 - Analysis of cvm7 and cvm7par

Sequence analysis had identified two additional genes in the paralogue cluster that
did not have obvious paralogues in either the clavulanic acid or cvm gene clusters. It was of
interest to determine if either of these genes was a paralogue to an as yet unidentified cvm
gene. Therefore the sequence of the cvm cluster (WO98/33896) was extended downstream of
cvm3 (orfup3 in WO98/33896).

8.1 Extension of cvm cluster sequence

The cosmid 10D7 (described in WO98/33896) was digested with the restriction
endonuclease SacI. From this digestion a 6.8 kilobase DNA fragment containing cas1 and
cvm1 was isolated and cloned into a pUC119 based plasmid. The resultant plasmid pCEC019
was used as a template to generate sequence information which allowed completion of the
partial cvm3 gene reported in WO98/33896. In addition, the sequence information showed the
presence of another open reading frame, cvm7, which was incomplete in this fragment. In
order to complete the cvm7 gene sequence, the next adjacent SacI fragment from cosmid
10D7, a 1.9 kb fragment, was subcloned. Sequence information was obtained from the end of
the clone which contained the remainder of the cvm7 gene, up to the point where the start
codon for the cvm7 gene could be identified. In total, this resulted in the generation of a
further approximately 3.9 kb of new DNA sequence which is described in Sequence ID
No.17.

8.2 Sequence analysis 

The size of cvm7 and its orientation relative to the rest of the cvm cluster is shown
diagrammatically in Figure 2. Sequence homology searches demonstrated that this gene shares
homology with transcriptional regulator genes. In addition cvm7 also shared 33% identity
with one of the two genes identified in the paralogue cluster that did not have any obvious
paralogues within the known clavulanic acid or clavam biosynthetic genes. Therefore, since
cvm6 and cvm6par have been shown to be paralogues, from this sequence data it can be
concluded that cvm7 and cvm7par are paralogues of genes involved in 5S clavam
biosynthesis.
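
For readers unfamiliar with how a figure such as "33% identity" is arrived at, the Python sketch below shows one common way of computing percent identity from a pair of already aligned sequences. It is not the method used to obtain the figure quoted above, which the text does not describe; the function name is hypothetical, the toy sequences are invented for illustration and are not taken from cvm7 or cvm7par, and the choice of denominator (aligned, non-gap columns) is one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


# Toy aligned pair, invented for illustration only.
toy_a = "MKT-AYL"
toy_b = "MRTQAFL"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Because the result depends on the alignment and on the denominator chosen, identity figures are only comparable when computed the same way.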

Brief description of the figures 

Figure 1. Diagram of the paralogue cluster. The orientation of transcription is shown for each
gene (direction of arrow).
Figure 2. Orientation of cvm7 in relation to published cvm cluster (WO98/33896).
Figure 3. Annotated sequence of the paralogue cluster.

Brief description of the sequences 

SEQ ID NO:1 cvm6para open reading frame
SEQ ID NO:2 cvm7para open reading frame
SEQ ID NO:3 cvm6para polypeptide
SEQ ID NO:4 cvm7para polypeptide
SEQ ID NO:5 cvm6 open reading frame
SEQ ID NO:6 cvm7 open reading frame
SEQ ID NO:7 cvm1 open reading frame
SEQ ID NO:8 cvm2 open reading frame
SEQ ID NO:9 cvm3 open reading frame
SEQ ID NO:10 cvm4 open reading frame
SEQ ID NO:11 cvm5 open reading frame
SEQ ID NO:12 orf2para open reading frame
SEQ ID NO:13 orf3para open reading frame
SEQ ID NO:14 orf4para open reading frame
SEQ ID NO:15 orf6para open reading frame
SEQ ID NO:16 paralogue cluster
SEQ ID NO:17 extended cvm cluster (underlined sequence denotes new sequence over that
disclosed in WO98/33896)
SEQ ID NO:18 orf2para open reading frame (reverse complement)
SEQ ID NO:19 orf3para open reading frame (reverse complement)
SEQ ID NO:20 orf4para open reading frame (reverse complement)
SEQ ID NO:21 cvm6 polypeptide
SEQ ID NO:22 cvm3 polypeptide
SEQ ID NO:23 orf6para polypeptide

Sequences

SEQ ID NO: 1 cvm6para 

ATGTTCCACCCGGTCCTGCCCCGGGGCCGCGAGGACCGCACCGTTCTGGTCTCCGGCCGCGGCTGCACCGTACGGG^ 
CGAAGGGCGCACCTATCTCGACGCCTCGTCGGTGCTCGGACT 
CCGCCGCCGAGCAGATGCGGACACTCGGTCACTTCCACACCTGGGGC^

GCGCGCCTCACCGACCTGGCGCCCCAGGGTCTCCAGCGCGTCTACTTCACCAGCGGCGGCGGCGAGGGCGTCGAGATCGC 
CCTGCGCATGGCCCGTTACTTCCACCACCGCACCGGCAGCCCGGAGCGCACCTGGATCTTGTCGCGCCGCACCGCCTACC 
ACGGCATCGGCTACGGCAGCGGTACGGTGTCGGGCTCGCCCGCCTACCAGGACGGGTTCGGCCCGGTGCTGCCCCATGTG 
CACCACCTCACGCCGCCCGACCCGTACCACGCCGAGCTGTACGACGGCGAGGACGTCACGGAGTACTGCCTGCGCGAACT 

CGCCCGCACCATCGACGAGATCGGCCCCGGGCGGATCGCCGCGATGATCGGGGAGCCGGTCATGGGCGCGGGCGGCGCCG
TCGTCCCGCCGCCGGACTACTGGCCGCGCGTCGCCGCGCTGCTGCGCTCCCACGGCATCCTGCTGATCCTGGACGAGGTC 
GTCACCGCGTTCGGCCGCACGGGGACCTGGTTCGCGGCCGAGCACTTCGGGGTGACCCCCGATCTGCTGGTGACCGCGAA 
GGGCATCACCTCCGGGTATGTCCCGCACGGGGCGGTGCTCCTGACCGAGGAGGTCGCGGACGCCGTGAACGGGGAGACGG 
GGTTCCCGATCGGCTTCACCTATACCGGTCACCCCACGGCGTGCGCCGTCGCGCTCGCCAATCTCGACATCATCGAACGG 

GAAGGGCTGCTGGAGAACGCGGTGAAGGTGGGCGACCACCTCGCCGGGCGGCTGGCGGCCCTGCGCGGGCTGCCCGCCGT
GGGGGACGTCCGGCAACTGGGCATGATGCTCGCCGTCGAGCTGGTGTCGGACAAGACGGCCCGCACCCCGCTGCCGGGCG 
GCACCCTCGGGGTCGTGGACGCGCTGCGCGAGGACGCGGGCGTCATCGTCCGGGCCACGCCGCGCTCCCTGGTCCTCAAT 
CCGGCGCTCGTGATGGACCGGGCCACGGCGGACGAGGTGGCGGACGGGCTGGACTCGGTGCTGCGGCGGCTGGCACCCGA 
CGGGCGGATCGGCGCGGCCCCCCGGCGGGGGTGA 


SEQ ID NO:2 cvm7para 

GTGTACGAGTGCAGCGATGAGGTTCGTCACGACGTCCCCGGCCTGCCGGGTCCGTCACCGTCCATCACCGTCCTGGGCTG 
TCTGGGCGTACGCGCCGACGGCCGGAAACTGGAGCTGGGCCCTCCGCGTCAGCGGGCCGTTTTCGCCCTGCTGCTCATCA 
ACGCGGGCAGTGTGGTGCCGGTCGACTCGATCGTCTTCCGTATCTGGGGCAACTCACCACCGGGCGCGGTCACCGCGACG 

CTCCAGTCCTATGTGTCCCGGCTGCGGAAACTCCTGGCCGAGTGTGTGCTCCCGGACGGTTCGACACCCGAACTGCTGCA
CCAGCCGCCGGGCTACACCCTCGCGCTCGGCACCGAGCACATCGACGCGAACCGTTTTGAGCAGGCCATCAGGACAGGGC 
GCCGGCTCTCGCGCGAGGAGCAGCACCAGGAGGCGCGGGCCGTGCTCTGCCAGGCCCTGCTGAGCTGGGGCGGGACACCG 
TACGAGGAGCTGAGCGCGTACGACTTCGCCGTCCAGGAGGCCAATCGGCTGGAGCAGCTCCGGCTGGGCGCCGTGGAGAC 
ATGGGCGCACTGCTGTCTGCGGCTGGGGCGGGACGAGGAGGTGATGGACCAGCTCAAGCCGGAGGTGCAGCGCAATCCGC 

TGCGGGAGCGGCTGATCGGGCAGCTCATGCAGGCGCAGTACCGGCTGGGGTGCCAGGCGGACGCGCTCAGGACGTACGAG
GCGACGCGGCGGGCCCTGGCCGAGGAGCTGGGGACCGATCCGGGCAAGGAGCTGGCGGCGCTGCACGCGGCGATCCTGCG 
TCAGGACAACGGTCTGGACCGCGTCGTCCCGGCGTCCGCGCCGCCGTCGGCGGGGGTCGGGCGGGGGGCCGTGACGGTGT 
CGGTCCCGGCACAGCGGTCGAGGCCGTTGACGCGGCCGGTGGCGGGGCGGGCGCGGGTCCCGGGGGCGATGACGGTGGCG 
GCGGGCGCGGGGGCGGCCCCCGCGTCCGCCTCCGGCTCCGTTTCCGCGTCCGTTTCCGGCTCCGGCTCCGGCTCCGGCTC 

CGCTCCTGCGTCGGTTCCCACCTTCTTTCCCGGCTCCGTTTCTGGCTCGGCGTCCGTTGCCGCGTCCGTAGCCGCGCCCG
TTTCCGGCCATGTCTCCGGGCCCGGGTCCGCTTTCGGGTCCGTGGCGCTCCACCGGCCGCAGACCCTCCGGGGCGAGCCG 
GTCCACGGGGGCGCGCAGGGGATGCGCACCGGGCAGGTGTTCCCCACGCTGCCGCCGTTCGTCGGGCGCGGCGACGAGCT 
GCGCGGTCTGCTGGAGTCCGCGACGTCCGCGTTCCACACCTCGGGGCGGGTGGCGTTCGTCGTCGGCGAGGCGGGCAGCG 
GCAAGACCCGGCTCCTCTCCGAGTTGGAGCGCTCGGTTCCGGACAGTGTGCGCACCGTCTGGGCGTCCTGTTCGGAGAGT 

GAGGACCGGCCCGACTACTGGCCGTGGACGACCGTGCTGCGGCATCTGTACGCGATGTGGCCGGAACGTATGCACGGATT
CCCCGGTTGGCTGCGGCGCGCACTCGCGGAACTGCTTCCCGAGGTGGGCCCGGAGCCACAGGGGCCGCACTCCCCCGACG 
GGGGCGAGGAGAACAGCGGCAACGGGGACGGTGCGGGCGACGGGGACAGCACCCCGGCGCACACCCTCACGCTCGCGCCC 
GCTCTCGCGCCCCCGCGCTCCAGAGAGGCTCGTTTCACCCTGCACGACGCCGTGTGCCAGGCGCTTCTGCGCACGGTCCG 
CGAACCCGTGGTGATCATGCTGGAGGACATGGAGCGGGCCGACGCCCCCTCGCTCGCCCTGCTGCGCCTCCTGGTGGAGC 

AACTGCGCACCGTCCCCCTGCTGCTCGTGGTCACCACGCGCACCTTCCGGCTCGCGCACGACGCCGAGCTGCGACGGGCC
GCCGCCGTGATCCTCCAGTCGACCGGCGCGCGCCGGGTCCTGCTGAACGCCCTGGACGCACGGGCCACCGGGGAACTCGC 
CGGAGGGATGCTGGGCAAGGCCCCGGACACCCTCCTCGTACGGGCCCTGCACGAGCGCTCCGCCGGGAACCCGTACTTCC 
TCGTCCAGCTCCTCCGCTCGCTCCGGCAGGGGCTCGCCGCCGCCTGGGAGACGGAGATCCCGGACGAGCTGGCCGGGGTC 
GTGCTGCAACGGCTGTCGAGCGTGCCGCCCGCCGTGCGCCGGGTGCTCGACATCTGCGCGGTCGTGGAGCGCAGTTGCGA 

ACGGCGTGTGATCGAGACCGTGCTGCGCCATGAGGGAATCCCGCTGGAGAACGTCCGTACGGCGGTCCGCGGCGGTCTGC
TGGAGGAAGACCCCGACGACCCCGGGCGGCTGAGGTTCGTGCATCCGCTGGTCCGGGAGGCCGTCTGGGACGACCTGGAG 
AACACCCGTCGGCCCGTSTCVMARGTCCCGTTCCTCCGCGCTCGGGGCGCTGGCCACGGTCTGA 

SEQ ID NO:3 cvm6para polypeptide 

MFHPVLPRGREDRTVLVSGRGCTVRDTEGRTYLDASSV^
ARLTDLAPQGLQRVYFTSGGGEGVEIALRMARYFHHRTGSPERTWILSRRTAYHGIGYGSGTVSGSPAYQDGFGPVLPHV
HHLTPPDPYHAELYDGEDVTEYCLRELARTIDEIGPGRIAAMIGEPVMGAGGAVVPPPDYWPRVAALLRSHGILLILDEV
VTAFGRTGTWFAAEHFGVTPDLLVTAKGITSGYVPHGAVLLTEEVADAVNGETGFPIGFTYTGHPTACAVALANLDIIER
EGLLENAVKVGDHLAGRLAALRGLPAVGDVRQLGMMLAVELVSDKTARTPLPGGTLGVVDALREDAGVIVRA^
PALVMDRATADEVADGLDSVLRRLAPDGRIGAAPRRG

SEQ ID NO:4 cvm7para polypeptide 

VYECSDEVRHDVPGLPGPSPSITVLGCLGVRADGRKLELGPPRQRAVFALLLINAGSVVPVDSIVFRIWGNSPPGAVTAT 
LQSYVSRLRKLLAECVLPDGSTPELLHQPPGYTLALGTEHI 
YEELSAYDFAVQEANRLEQLRLGAVETWAHCCLRLGRDEEVMDQLKPEVQRNPLRERLIGQLMQAQYRLGCQADALRTYE
ATRRALAEELGTDPGKELAALHAAILRQDNGLDRVVPASAPPSAGVGRGAVTVSVPAQRSRPLTRPVAGRARVPGA
AGAGAAPASASGSVSASVSGSGSGSGSAPASVPTFFPGSVSGSASVAASVAAPVSGHVSGPGSAFGSVALHRPQTLRGEP
VHGGAQGMRTGQVFPTLPPFVGRGDELRGLLESATSAFHTSGRVAFVVGEAGSGKTRLLSELERSVPDSVRTVWASCSES
EDRPDYWPWTTVLRHLYAMWPERMHGFPGWLRRALAELL
ALAPPRSREARFTLHDAVCQALLRTVREPVVIMLEDMERADAPSLALLRLLVEQLRTVPLLLVVTTRTFRLAHDAELRRA
AAVILQSTGARRVLLNALDARATGELAGGMLGKAPDTLLVRALHERSAGNPYFLVQLLRSLRQGLAAAWETEIPDELAGV
VLQRLSSVPPAVRRVLDICAVVERSCERRVIETVLRHEGIPLENVRTAVRG
NTRR PV S RS S ALGALATV 

SEQ ID NO:5 cvm6

GTGCCCGGCTCCGGACTCGAAGCACTGGACCGTGCCACCCTCATCCACCCCACCCTCTCCGGAAACACCGCGGAACGGAT
CGTGCTGACCTCGGGGTCCGGCAGCCGGGTCCGCGACACCGACGGCCGGGAGTACCTGGACGCGAGCGCCGTCCTCGGGG 
TGACCCAGGTGGGCCACGGCCGGGCCGAGCTGGCCCGGGTCGCGGCCGAGCAGATGGCCCGGCTGGAGTACTTCCACACC 
TGGGGGACGATCAGCAACGACCGGGCGGTGGAGCTGGCGGCACGGCTGGTGGGGCTGAGCCCGGAGCCGCTGACCCGCGT 
CTACTTCACCAGCGGCGGGGCCGAGGGCAACGAGATCGCCCTGCGGATGGCCCGGCTCTACCACCACCGGCGCGGGGAGT 

CCGCCCGTACCTGGATACTCTCCCGCCGGTCGGCCTACCACGGCGTCGGATACGGCAGCGGCGGCGTCACCGGCTTCCCC
GCCTACCACCAGGGCTTCGGCCCCTCCCTCCCGGACGTCGACTTCCTGACCCCGCCGCAGCCCTACCGCCGGGAGCTGTT 
CGCCGGTTCCGACGTCACCGACTTCTGCCTCGCCGAACTGCGCGAGACCATCGACCGGATCGGCCCGGAGCGGATCGCGG 
CGATGATCGGCGAGCCGATCATGGGCGCGGTCGGCGCCGCGGCCCCGCCCGCCGACTACTGGCCCCGGGTCGCCGAGCTG 
CTGCACTCCTACGGCATCCTGCTGATCTCCGACGAGGTGATCACGGGGTACGGGCGCACCGGGCACTGGTTCGCCGCCGA 

CCACTTCGGCGTGGTCCCGGACATCATGGTCACCGCCAAGGGCATTCACCTCGGGGTATGTGCCGCACGGCGCCGTCCTG
ACCACCGAGGCCGTCGCCGACGAGGTCGTCGGCGACCAGGGCTTCCCGGCGGGCTTCACCTACAGCGGCCATGCCACGGC 
CTGCGCGGTGGCCCTGGCCAACCTGGACATCATCGAGCGCGAGAATCTGCTCGACAACGCCAGCACCGTCGGCGCCTACC 
TGGGCAAACGCCTGGCCGAGCTGAGCGATCTGCCGATCGTCGGGGACGTCCGGCAGACCGGTCTGATGCTCGGTGTCGAA 
CTGGTCGCCGACCGCGGAACCCGGGAGCCGCTGCCGGGCGCCGCCGTCGCCGAGGCCCTGCGCGAGCGGGCGGGCATCCT 

GCTGCGCGCCAACGGCAACGCCCTCATCGTCAACCCCCCGCTGATCTTCACCCAGGAAGACGCCGACGAACTCGTGGCGG
GCCTGCGCTCCGTACTCGCCCGCACCAGGCCGGACGGCCGGGTGCTCTGA 

SEQ ID NO:6 cvm7 

ATGAAGTACGACATAACCCCACCATCCGGCCTTCGGTTCGACCTCCTCGGCCCGTTGACCGTGACCGCCGGCGAGCAACC 
CGTGGACCTGGGCGCGCCACGGCAGCGCGCCCTGCTCGCCCTGCTGCTCATCGATGTCGGCAACGTGGTCCCGCTGCCGG
TCATGACCGCGTCGATCTGGGGGGCCGACCCACCGTCCCGGGTCCGGGGGACGCTCCAGGCTTATGTGTCCCGACTGCGG 
AAACTCCTGCACCGCCATGACCGTTCCCTTCGCCTTGTCCACCAGCTCCAGGGGTATCTCCTCGAAGTGGATTCGGCGAA 
GGTGGACGCCGTGGTTTTCGAGACACGTGTCAGGGAGTGCCGGGAATTGAGCAGGGCCCGGAACCCCGAGGCCACCCGGG 
CCGTGGCCTGGTCCGCCCTGGAGATGTGGAAGGGCACACCCATGGGCGAGCTGCATGATTATGAATTTGTGGCGGCGGAG 
GCCGACCGGCTGGAAGGAATCCGGTTACGCGCGCTGGAGACCTGGTCCCAGGCGTGTCTCGATCTCCAGCACTATGAAGA
GGTTGCATTTCAGCTCGGCGAGGAGATCCACCGCAATCCGGAACTGGAACGGCTGGGCGGTCTCTTCATGCGGGCCCAGT 
ATCATTCCGGACGGTCGGCGGAAGCCCTGTTGACGTATGAACGTATGCGTACCGCGGTGGCGGAGAATCTGGGGGCCGAT 
ATCAGTCCGGAGCTCCAGGAACTCCATGGAAAGATTCTGCGCCAGGAACTCACGGAGACACCCGCCGCGCGATCGACGGC 
CTCCCTCACACGGGCGGCGGGCCCGCACGGGCCCCCGCCCCTGGCCGAAACCGGCACCCCCGCCGCACCCGCGGACATGG 
CCGAAACCACGGTGGCGGAGGAAAGCGCCGCGCCCCCCGCCCCGGCGGCGCCCGGGACCCCGCCCCCCATGCCGTCCCCC
GTACCGCTCCCCCATCCGTCAGGGGCCGTCCCGCCGGTCACCCCGGTGCCTCCCCCGGTCCCCCGCTCGGCCCTCCGTTC 
AGCGGCACCCGCCGAGACCGAGGACCCGGAACCGGCGCCGCCCCCTCCCCCTCCGCCGGGCGGCCGACTCATCGGCCGCC 
GCGCCGAACTGCGCAGGCTGCGGCTGCTGCTGACGAAGACCCGCGCGGGCCACGGCCATGTCCTGCTGGTCTGCGGCGAA 
CAGGGCATCGGGAAGACCCGGCTCCTGGAGCACACCGAGCACACCCTGGCCGCGGGCGCGTTCCGGGTGGTCCGTTCGCA 
CTGCGTCGCCACCCTCCCGGCACCGGGCTACTGGCCCTGGGAGCACCTCGTACGCCAGCTCGACCCGGACAGCGGCCTCG
GTGACGACGGCGACGCCGACCCCGTCGCCCAGGCCGAGTGGCTGCCGGAACACCACCTCACCCACCAGATGCGGATCTGC 
CGGACGGTGCTCGCCGCGGCGCGGCGGACCCCGCTCCTGTTGATCCTGGAGGATCTGCACCTCGCCCACGCGCCGGTCCT 
GGATGTGCTCCAGCTCCTGGTCAAACAGATCGGCCAGGCCCCCGTCATGGTCGTCGCCACCCTGCGCGAGCACGATCTCG 
CCCGGGACCCCGCCGTCCGCCGGGCCGTGGGCCGCATCCTCCAGGCGGGCAACACCGGCACCCTCCGGCTGGACGGGCTC 
ACCGAGGAGCAGAGCCGGGAGCTGATCGTCTCGGTCGCGGGGGCCCCGTTCGCGCCCCATGACGCCCAACGGCTCCAGCG
CGCCTCGGGCGGCAACCCGTTTCTGCTGCTCAGCATGGTCACAGGGGAGGACGGCACCCAGGAGTGGGCACGGCCGTGCG 
TCCCGTTCGAGGTGCGCGAGGTGCTGCACGAGCGGCTGAGCGAATGCTCCCCGTCCACCCAGGACGTGCTCACGCTCTGC 
GCCGTGCTCGGCATGAGCGTGCGCCGACCGCTGCTCACCGACATCATGTCCACGCTCGACATCCCGCACACCGCGCTCGA 
CGACGCGCTCGGCACGGGGCTGCTGCGCCACGACCGGAACACCGACGGAATGGTCCACTTCGCCCATGGGCTGACCCGGG 
ACTTCCTGCTCGACGACACCCCGCCGGTCACCCGCGCCCGCTGGCACCACCGGGTCGCCGCCACCCTCGCCCTGCGCTTC
CAGCAGGGCGACGACCACGCCGAGATCCGCCGCCACTGTCTGGCCGCGGCCCGTCTGCTCGGCGCCCGCGCGGGGGTGCG 
CCCCCTGCTGGCGCTGGCCGACCGGGAGCAGTCCCGCTTCTCCCACGCGGAGGCGCTGCGCTGGCTGGAGAGCGCGGTCG 
CGGTCGTCGCGGCGCTGCCCCGGGACCAGCCGGTGTCCGCCGTCGAACTCCAGTTGCGCAAACGGATGATGGCGCTGCAC 
GCGCTGATGGACGGCTATGGATCGGCCCGCGTCGAGACGTTCCTCTCCCAGGTCACCCAGTGGGAACACGTCTTCGACAA 
CACCCAGCCCACCGGGCTGCTGCACGTCCAGGCGCTGAGCGCGCTCACCACGGGCCGCCATGAGCAGGCGGCGGAGCTGG
CCGGGCTGCTGCACGAGCTGGCCGACCACGGCGGCGGACCGGAGGCCCGGTCGGCGGCCTGCTATGTGGACGGCGTCACC 
CTGTATGTGGGCGGACGGGTCGACGAAGCCCTCGCCGCGCTCGCCCAGGGCACCGAGATCACGGACGCCCTCCTGGCCGG 
ACACCGCAGGACCGCCGCCCCGCACGGCGGCGGGCACCTCCAGGACCGGCGTATCGACTTCCGCGCCTATCTGGCGCTCG 
GCCACTGTCTCAGCGGCGACCGGATTCAGACCCAGCGCTACCGGACGGAACTCCTCCACCTCACCCAGTCGGAACGGTAC 
GACCGGCCGTGGGACCGGGCCTTCGCCCGCTATGTGGACGCGCTCATCGCCGTCACGGAGTGCGATGTCCAGGGGGTGTG
GCTGGCCGCGCGGGCGGGGCTCGACCTCGCCGCCCGCTGCCAGCTCCCGTTCTGGCAGCGGATGCTCGCCGTCCCCCTCG 
GCTGGGCCGAGGTCCACCAGGGGGCGCACGACAAGGGGCTGGCCCGGATGCGGGAGGCGCTGCACGAGGCGGCCCGGCAC 
CGGACCCTGCTGCGCCGTACGCTCCACCTCGGCCTGCTCGCCGACGCCCTCCAGTACACGGGCGCCCGGGAACAGGCCCG
GCGCACGATGTCCTCCGCCGTACGGGAGATCGAGCGCCGCGGCGAGTACTTCTGTCTCCGGCCGCAGTGGCCCTGGGCCC 
GGCTCCTCCACAGCCACGGCACCTCCGCCGCGGCGGAGCACCGGGTCGTCCACGGCAGGCACTGA 

SEQ ID NO: 7 cvm1

ATGTCCCGCTCTCCGCCCGAGTCCCCGGCCGGTTCCGTGTCCGCCGCGGTTCCGCGTCCGCCGGTCCGCGCCCTGCGGGA 
CCTTCCGGTCAGTGCCCAGGGGCTCGGCTGCCTGCCGACCACCGACTTCTACGGACGCCCGGACCGCGCCCGGGCGACGG 
CCACCATCCGCGCCGCCGTCGACGCCGGGGTCACCCTGCTGGACACCGCCGACGTCCAGGGGCTCGGCGCCGGTGAGGAG 
CTGCTCGGACGGGCGGTCGCGGGCCGCCGGGACGAGGTGCTGATCGCCACCAAGTTCGGCATGGTGCGCTCGTCCGACGG 

CGCCTCCCAGGGCTTGTGCGGCGAGCCGTCCTACGTCCGCGCGGCCTGCGAACGGTCCCTGCGTCGTCTCGGCACCGACC
GCATCGACCTGTACTACCAGCACTGGACGGACCCGGCGGTGCCGATCGAGGAGACCGTGGGTGCGGTGGCCGAGCTGGTG 
CGCGAGGGCAAGGTCCGCAGGCTCGGTCTCTCCGAGCCCTCCGCGGCCACGCTGCGCCGGGCGGACGCGGTGCACCCGGT 
GACGGCGGTGCAGAGCGAGTGGAGCCTGTGGTCGCGCGGGATCGAGGACGAGGTGGTGCCCGTCTGCCGGGAGCTGGGGA 
TCGGGATCGTCGCTTACGCCCCTCTGGGACGGGGTTTTCTCACCGGCACCATCCGCACCACCGACGATCTGGGGGACGAG 

GACTTCCGCCGGGGCCAGCCCCGGTTCAGCGCTCCGGCCCTCGCGCGCAACCGCTCGTTGCTGCACCGGCTGCGCCCGGT
CGCGGACGGTCTGGGGCTGACCCTGGCACAGCTCGCGCTCGCCTGGCTGCACCACCGGGGCGAGGACGTCGTCCCGATCC 
CGGGCACCGCGAACCCGGCCCATCTCGCGGACAATCTCGCCGCCGCCTCGATCCGGCTGGACGACCGGTCCCTCGCGGAG 
GTGACGGCCGCGATCTCCCACCCGGTGTCCGGGGAGCGGTACACCCCGGCATTGCTCGCCATGATCGGCAACTGA 

SEQ ID NO: 8 cvm2

ATGTCCGTGGCATCGGCCGGTATGACGGACGAGCAGCGCAAGGCGGTCATCACCGCGTACTTCAAGGCGTTCGACAACGG 
CGGCGTCGGCAGCGACGGCACCCCCGCGATCGACTACTTCGCCGAGGACGCGGTCTTCTTCTTCCCCAAGTGGGGTCTGG 
CCCGGGGCAAGTCCGAGATCGCCCGGCTCTTCGACGACCTCGGGGGCACCATCCGCTCGATCACCCACCATCTGTGGTCC 
GTCAACTGGATTCTGACCGGGACCGAACTCCTCGCCGCGGAGGGCACCACCCACGGTGAGCACCGGGACGGGCCGTGGCG 
GGCGGGTGACCCCGAGTGGGCCGCCGGGCGCTGGTGCACGGTCTACGAGGTGCGGGACTTCCTCGTCCACCGGGCCTTCG
TCTATCTGGACCCCGATTACGCGGGCAAGGACACCGCGCGTTACCCGTGGCTGTGA 

SEQ ID NO: 9 cvm3

GTGACCCGGCCTCCGGGCCTTTCCGCGCACACCCACGGGTCCGTGTCCGGGAGTCTGCTGCGCCGGGTGGCGGGCCACTA 
TCCCACCGGGGTGGTCCTGGTCACCGGTCCGGCCGAGGCTCCGGGGCAGCCGCCGCCCGCCATGGTGGTGGGGACGTTCA
CCTCGGTGTCGCTCGATCCGGTGCTGGTGGGTTTCCTCCCGGCCAGGTCGTCGACGACCTGGCCGCGGCTCCGGGCGGCC 
GGGCGTTTCTGCGTCAATGTGCTCGGCGCGGATCAGGGCCCGGTCTGCCGGAGTTTCGCCGGGGGCGATCCGGGGCGCTG 
GGAGGTGCCGTACCGGACGACGGCCACCGGCTCCCCCGTCCTGCTCGACGCGCTCGCGTGGTTCGACTGCGAGGTGGCGG 
GGGAGACGGAGGCGGGCGACCACTGGTTCGTCACCGGGGCGGTGCGCGACCTCGGGGTGATCCGCGAGGGTTCGCCCCTG 
GTCTTCCTGCGGGGCGACTACGGGCACTGGGCCGGGGGCGGCGGCTCGGGCCGGGCGGGGCGGCGGTCCGCCGTCTGCCC
GGTCTGA 

SEQ ID NO: 10 cvm4

GTGGAATGCCGCATATTCGAGATCGACGAACTGCCGTTGCTGGACGGGGAGGTCCTGCGGGACGCCCGGATCGGTTACGC 
CATGTACGGCACGCCGAACGCCGACGGGACGAACGTGGTGCTCTGTCCGTCGTTCTTCGGCCGGGACCACACCGGGTACG
ACTGGCTGATCGGTGCGGGGCTGCCGCTGGACACCCGGCGGTACTGCGTCGTCACCGCCGGACTCTTCGGCAACGGGGTC 
TCCAGCTCGCCCGGCAACCACCCGTCGGGGTCCCGCTTTCCGCTGATCACTCCGCAGGACAATGTCGCGGCGCAGCACCG 
GCTGCTCACCGAGGAGCTGGGGGTACGGGAACTGGCCCTGGTCACGGGCTGGTCGATGGGCGCGGCCCACGCCTACCAGT 
GGGCCGTGTCGCATCCGGGGATGGTGCGCCGGATCGCCCCGATCTGCGGGGCGCCGGTGAGCAGCCCGCACAGCCTGGTC 
CTGCTGTCCGGTCTGGCCGCGGCGCTCAGCGCCGACGCCGGGGAGCGGGGGCGGAAGGCGGCGGGCCGGGTGTTCGCCGG
GTGGGGGACCTCGCGTTCCTTCTGGGCCCGCCGTGCCCACCGGGAGCTGGGTTTCGCCACCCGCGAGGAGTACCTCACCG 
GCTTCTGGGAGCAGGTCTTCCTCTCCGGGCCCGGCGCCGCGGATCTGCTCACCATGGTGCGCACCTGGGAGAACACGGAT 
GTGGGGGCGACACCCGGGGCCGGGGGGAGCGTCGAGGCGGCGCTGGCCTCCGTCACGGCGCGGGCCGTGGTGCTGCCGGG 
CGCCCTGGACGTGTGTTTCGCCGTCGAGGACGAGAAGCGGGTGGCCGATCTGCTGCCGTATGCCTCGCTGGAGGTGATCC 
CGGGAGTGTGGGGGCATCTCGCGGGGTCCGGGGGGTCGGCCGCCGACCGGGAGTTCATCGGGGGCGCGCTGCGGCGGCTG
CTGGACAGCCCGGTGGACGGGGGCTGA 

SEQ ID NO: 11 cvm5

GTGAAGTCCATTCTCTTCTATCTGCCAACGGTCGGCAGTCATGCGCAGGTCCAGCGGGGTATGGCGGGGGTCAATCCGCA 
GAACTACCAGAACATGCTCCGGCAGCTCACCCGGCAGGCGCAGGCGGCCGACGAACTCGGCTACTGGGGACTGTCCTTCA
CCGAGCACCACTTCCACACCGAGGGTTTCGAGGTCTCCAACAACCCGATCATGCTGGGGCTCTACCTCGGCATGCAGACC 
CGGCACATCCGGGTCGGCCAGATGGCCAACGTCCTGCCGCTGCACAATCCGCTGCGGCTGGCCGAGGATCTGGCGATGCT 
CGACCACATGACCCGGGGCCGCGCCTTCGTCGGGATCGCGCGCGGGTTCCAGAAGCGCTGGGCCGACATCATGGGGCAGG 
TGTACGGGGTCGGCGGCACCCTGTCCGACGCCGGGGAGCGGGACCGGCGCAATCGTGCCCTCTTCGAGGAGCACTGGGAG 
ATCATCAAGAAGGCGTGGACGACCGAGACGTTCACCCACTCCGGGGAGCAGTGGACGATCCCGGTGCCGGACCTGGAGTT
CCCCTACGAGGCGGTGCGCCGCTACGGCCGGGGCCTCGACGAGAACGGCGTCATCCGCGAGGTGGGCATCGCGCCCAAGC 
CCTACCAGCGCCCCCACCCGCCCGTCTTCCAGCCGTTCAGCTTCAGTGAGGACACGTTCCGGTTCTGTGCCCGGGAGGGC 
GTGGTGCCGATCCTGATGAACACCGACGACCAGATCGTCGCCCGGCTGATGGACATCTACCGGGAGGAGGCCGAGGCGGC 
GGGCCACGGCACCCTGCGGCGGGGCGAGCGGGTCGGGGTGATGAAGGACGTCCTGGTCTCCCGGGACTCCGGCGAGGCCC 
ACCACTGGGCGTCCCGCGGCGGCGGCTTCATCTTCGAGAACTGGTTCGGCCCCATGGGCTTCACCGAGGCGCTGCGCGCG
ACCGGCGAGACGGGTCCGATCGGCTCGGACTACAAGACCCTGGTCGACCGGGGGCTGGAGTGGGTCGGCACCCCGGACGA 
C^TCAACCGCATGATCGAGAAGCTGGTGGAGCGGCACGATCCGGAGTATCTGCTCCAGTGCC^GTACTCCGGGCTGATCC
CGCACGATGTCCAGCTGCGC^GCCTGGAGCTGTGGGCCACCGAGATCGCCCCC^\ACTGGCTCTGA 

SEQ ID NO: 12 orf2para

TCAGATGGCCAGGGCGGCGAAACCGCCGGACTGGAAGTCGTAGGCCACCGGTACCTCGATCAGGAACGGGCGGCCGAGTC
CGGCGCCCTTGGTGAGGGCGGCGAGCAGCGAGGTGCGGTCGGTGGCGCGGACGGCCTCGCAGCCGTTGGCCTCGGCGAGC 
TGGACGAAGTCGACGCTTCCGAAGCCGACGGCGGGGGCGTGGGAGCGCTGGTGTCCGAGGTTCTGGTACAGCTCGATCAG 
GCCGTTGCGGTCGTTGTTGACGACGACCATGACGATCGGCAGGCCCAGGCGCACGGCCGTCTCGATGTCGGCGCTGTTGG 
AGTGGAAGCCGCCGTCGCCCGCGATGAGGAAGACGGGCTCGCCGGGCCGGGCGATCTGGGCGGCCATGGCGGCGGGCAGT 

CCGTAGCCGAAGCTGGAGCAGCCCGCGGAGGTGAGGAATCCGTACGGCTGGTCGGACTTGGCGAAGAGCACGCCGTAGTG
GCGGAAGAAGCCGATGTCGCTGACGAAGGTGCCGTTGTCGAGGACGGAGTTCATGCAGTCGATCACCTGGTGGACCCGCA 
TGCCGTCCTCGTACTCGGTGGGGTCGGCGAGGAATTCGGCGACGCGGGCGCGCAGGGCGCTGAGGTCGTGCCGGGTCTTG 
GGGGCGAGGCCCGAGGTCGCGTCGTCGAGCGCGGTGACGAATTCGGCGACGTTGGTGACGATGTCGATGTCGGCGCGGAA 
CAGCTCCGGGATCGGGTTGACCTCGGGGGCGACCCGGACCGTGGTCTTGGCCCGGCCCCGCGTCCACATGGAGGGGCGCA 

GGTCCTCGGCGTAGTCGTAGCCGATCGCCAGGAGGAGGTCGGCGGGGCCGAAGATCTCGTCGAGGGCCGGGTGGCCGAGA
ATGCCGTCCATGTAGCCGCTGATGGCGCCGTAGTTGAGCGGGTGGTCGTGCGGCAGGACGCCCTTGGCGGTGTAGGTGGT 
GACGACGGGGATGTTCAGCCGCTCGGCGAGGGCGCGCAGGGCGTCGACGGCCCCGGCGCGGATGACGGCGCTACCGACGA 
CGAGGAGGGGGTTCTCGGCCTCGCGCACCAGCTCAGCGGCCTCGTCGAGGCGGGCGCGCCAGTCGGCGTCCAGGGCGTGG 
GTGGCGGTGGCCCGGACCAGGGGGGCGTCGGTGGGGGTGCCGTTCAGCTCGGCGCCGAGGAGGTCGACCGGCAGGCTGAT 

GAAGCTGGGACCCACGGGCTCGATCCGGCTGTTGAGGACGGCGCTGTCGACGAGGTTGACGATGTCCTCGCCGCGTTCGA
GCTGGACGCTGAACTTGGTCAGCGGGCCCATCACGGCGGTGCTGTCCAGGCACTGGTGGGTGACGTTGGGGTAGCAGTCG 
TACGACTCGGACTGCGCGGCCAGCGCGATGACCGAGCTGCGGTCCAGGGCGGAGGTGGCGACGCCGGTGGCCAGGTTGGT 
CATGCCGGGGCCCAGGGTCGCGAAGCACGCCTGGGGGCGGTTGGTGATCCGGGCGAGGACGTCCGCCATCACCCCGGCGG 
TGAACTCGTGCCGGGTCAGGACGAAGTCGAGTCCTTCGACCTCGTCGAAGAGAATGGCGGACGCCTCCCGGCCGACGACG 

CCGAATACATGGTCGACACCGTACTGGTGAAGACGTTCCAGCATGGCTTTCGCGGTCGTGGTGGCCAT

SEQ ID NO: 13 orf3para 

TCATACGACCACCCGGCCCTGGAGCCTGAGCCTGCGCACCGCGTCGACGGAGCGCCGCACCGTCTCGCCGAAGTCCACGT 
CCTCCGGCGGCACCGTGTCGATGACCACCGCGTCGTACAGGCGCCGTGCCATGGCGCCCTTGACGGCCGTCACCTCGTCG 

CGCCGGATCCCTTCGGCGAGGAGCAGTCCGGTCCACGCGCTGGTGGTGCCGGACCCCTCGTGGATGCCCAGCTTGGGGCG
GGCCACGGTCTCGGCGGGCAGCAGGCCGGAGAGGGCCTGCCGCAACACCCACTTGTCGGTGCCCCGCCGGCGTTTGAGCC 
CGGGTTCGAGGGAGACCAGCGCGTCCAGGACCGCGCGGTCCCAGTACGGGTGGGTGGTCCACTTCCCGGCGATGCCCGCG 
AGGACGGGGGACATCTCGTTGAGGCCGTCGAAGCCCGCCATGTCGCCCGCGATCTCGTCGTCGAGGGACCAGAGCGAGGC 
CGTGCGCCGGTGCATACCGCCGAGCGGGATGTCGGCGCCGTACCCGGTGAGGATGCGGAGCGGCCCGGTGTCGAGCCGCC 

GGTAGAGGGCGACGAGCGGCAGCAGGTACTCCAGGACCGTGGGGTCGGTGATCTCCGCGGCGGCGACCGCCCAGGGCAGT
TCCCTGACGAGTTCGGCCGAGTGGAGCCGGATCTCGCTGTGCGCGGTGCCCAGGTGGACGGCGACCGAGCGGGCCGCGTC 
GAACTCGTCGGACACCTCGGTGCCCATCGACACGGACCGTGTCCCGGGTGCCAGGGCCGCCGTGTGGGCGGCGACTCCCC 
CGGAGTCGATGCCGCCGGACAGGACGACGGTGGGGGCCGCCTCCCCGCCGCGCAGCCGGGTGCGGACCGCCGTGGCGAGG 
CGTTCGCCGACCAGGTCCACCGCCTCCCGTTCGCCGGGCAGCGCCCGGGAGAGCGGGGGTGTCCAGGTGCGGACCGCCCT 

GGCGGTGATGTCGGAGCCGCCGACTCCGTGCAGCAGGAGGGCGGTCCCGGCGGGGACCCGGCAGACGCCCGCCGCCCCCG
GCGCGGTGTGGGTGCCGGACAGGCCCAGCGGCCGGCCCGGCTCGTGCGCCAGGGTCTTCGCCTCGGTGGCGGCGCTCAGC 
CCCGTCACGTCGGCGCGCAGCCACAGCGGTACCGAACCGGCGTGGTCGGTGGCCGCGACGGTCGCGCCGGTGGAGGCGTC 
GGTGAGCAGTGCGGCGAACCGTCCGTTCAGGAGCCGGAAGGCCCCGGGGCCCCAGCGCCGCCAGGCGGCCAGCAGCAGTT 
CGGCGTCGCCGAGGGCGGCAGAGGAGCCGCCGAGCGCTCCGGTCAGCTCGGCGCGGTTGTACAGCTCGCCCGCCAGGAGC 

AGCCGGACCTGGCCGTCGGCGACCAGGACGGGCGGACGGCCCAGGGTCACGGCCGTTCCGCTCCAGAGCGGGTACGCGGT
GCCGTCGTGCACGGGGACATGGGTCCCGCGGACGGCGAAGCGGGGTGCGCTGCCGGGTTCGGAGTGACCGCCGGGGCCGC 
CGCCGGGGCGGCCCTCGGTGCCGATGCGCACCCGGAATCCGTACACGAGGTCGGGGCCGGGCAT 

SEQ ID NO: 14 orf4para 

CTACCCCCACCGCTGCCCGGCGAAGTCCACGGCGCTCTCGGCGTCCACCGCGTCCACCGCGTTCTCGGCGTTCTCGGCGT
CGTCCGCCGCCGCCCCCGGTGGCAGGGGAGAGTCCACCGGTGCCGACGCGGGCGACGTGGTGGCGCGGGCGTACTGGTAG 
AGCAGTTCGGCCCCGATCTCCGCCGCCAGCAGGGAGGTGATCCCCGACGGGTCGTACGCCGGGGACACCTCGACCACGTC 
GAAGCCGACGGGCCTGAGCTGCCCGACCACGTCGAGCAGGGTCAGCACCTCGCGCGAGGACAGCCCGCCGGGGGCCGGTG 
TGCCGGTGCCCGGGGCGTACGCCGGGTCGACGACGTCGATGTCGACGGAGACGTACAGCGGCAGGCCGCCGACGGTGCGC 

CGGATCTGCTCGGCGATGCCGCGCGGTGAGCGCCGGGTGAAGTCGGCGGCGGTGACGATGCTGACGCCGTGCCCGCGCGC
GTAGTCCAGGGAGTCGGGCCGCGGATTGTGGCCGCGGATGCCGACCTGGACCAGGCGCTCCGGGTCCACCAGGCCCTCTT 
CGATGGCCCAGCGGAAGGGGGTGCCGTGGTGGTAGGTGCCGCCGTAGACGGGTGGGTTGGTGTCGCTGTGCGCGTCCAGG 
TGCAGGACGGCGACCCGGCCGTGGCGGGCGTGCACGGCGCGCAGGGCGGCCAGGGAGAGCGAGTGGTCCCCGCCCAGCAT 
CAGGAACGCGTCGTTGCGTTCCAGGAGCCGGGTCAGGGCGACCGTCGCGGTGTCCATCGCCAGGTCCATCGAGAAGGGGC 

TGAGGTCGATGTCGCCCCCGTCGACCACGTCGATCCGGTCGAAGACCCCTGGGCCCCGGTCGATGCCGACGCCGTGGATC
AGGCTGGACTCGTGCCGGATGGCGCGCGGCGCGAACCGCGCGCCGGGCCGGTAGCTGGTGCCTCCGTCGTACGGGGCGCC 
GACGACCACCACGTCATGGCCGATCGGGTCGGGCCGGTGGCGCAGCCGCATGAAGGTCGCCGGTTGGGCGTAGCGCGGGG 
AGACGGCGGTGGACAC 

SEQ ID NO: 15 orf6para

ATGCGTGCCTCTTCGCCCAGAGGGTTCCGCGTGCACCACGGTCACGCCGGGATCAGGGGGTCCCACGCGGACCTCGCCGT 
CATCGCCTCCGACGTTCCCGCGGCGGTCGGCGCGGTGTTCACCCGTTCGCGGTTCGCCGCGCCGAGTGTGCTGCTCAGCC 
GGGACGCGGTCGCCGACGGGATCGCCCGGGGCGTGGTGGTGCTGTCCGGCAACGCCAACGCCGGGACGGGCCCGCGGGGG
TACGAGGACGCCGCGGAGGTGCGCCATCTGGTGGCCGGGATCGTCGACTGCGACGAGAGGGATGTGCTGATCGCCTCCAC 
GGGACCCGTCGGCGAGCGGTATCCGATGTCCCGTGTCCGGGCCCATCTGCGGGCGGTGCGCGGGCCCTTACCGGGTGCCG 
ACTTCGACGGCGCGGCGGCGGCCGTGCTGGGCACCGCGGGCGCCCGTCCCACGATCCGGCGGGCGCGGTGCGGCGACGCG 
ACGCTGATCGGTGTCGCCAAGGGCCCGGGTACGGGCCCGGCGGAGCAGGACGACCGGTCGACGCTGGCGTTCTTCTGCAC
GGACGCCCAGGTGAGCCCCGTCGTCCTCGACGACATCTTCCGCCGGGTCGCGGACCGCGCCTTCCACGGGCTGGGCTTCG 
GCGCCGACGCCTCCACCGGCGACACGGCGGCCGTTCTCGCCAACGGGCTCGCGGGCCGGGTGGACCTCGTCGCGTTCGAA 
CAGGTCCTGGGCGCGCTGGCGCTGGACCTGGTCAGGGACGTCGTCCGGGACAGCGGCTGCGGCGGCGCCCTGGTCACGGT 
GCGGGTCACCGGGGCCCACGACACCGAGCAGGCCGGGCGCGTGGGCCGGGCGGTGGTCGACGCGCCGTCGCTGAGGGCCG 
CGGTGCACGGCCCGGCACCCGACTGGGCGCCGGTCGCCGCCGTGGCGGGTGGACACGGGGACGAAGGCCCCGGCCGGTCT
CCCGGGCGGATCACGATCCGGGTCGGCGGCCGGGAGGTCTTCCCCGCCCCCCGCGACCGGGCCCGCCCGGACGCCGTCAC 
CGCGTATCCGCACGGCGGCGAGGTGACCGTCCATATCGACCTCGGTGTCCCGGGCCGGGCGCCCGGCGCGTTCACGGTCC 
ACGGCTGCGACCTCCTGGCGGGGTACCCGCGCCTCGGCGCCGGCCGGGCCGTCTGA 

SEQ ID NO: 16 para cluster

CCATGGGAGCAGCATCGCAGTGCGCCTCCCCGGCCGCCATGCCGCTAGCTGGTAGTCCCCCTGCCGGGTGCCGACCGCCG 
GGGCGGTCCCGGGTGCGGCGGCCGGATCTAGTCGGTGTGCTCCGACGGTGCCTGCTGGGTGAGGGGCAGTGTCAGGCGGA 
TGGTGGTTCCCGCGCCGGGCGGGCTGTGCAGCCGCAGTTGGCCGCCGAGTGCCTCCACCCGGTCGGTGAGGCCGACGAGG 
CCCGAGCCCCGGCAGGGGGCGGCGCCACCGCGGCCGTCGTCGCGGATGCCGACGTGGAGCCGTCCGTCCCGGGTGGCCAC 
ATGGACGTCGACGACGGTGGCACCGGAGTGCTTGGCGGCGTTGGTCAGGGCCTCGGAGACGGCGTAGTACGCGGCGGTCT
CGACCGGTTCGGGGTGGCGTTCCCCGGTCTGGATGTCGAGCCGGACCGGGATGGCGGAGCGCCGGGCCAGGGCCTTGAGC 
GCCGGGCGGAGTCCGCCCTCGGCGAGTACCGCCGGGTGGATGCCCCGGGCGACCTCCCGGAGTTCGTCGACGGCGGCGGC 
CAGCCCGTCGGTCACCTCGTCGAGCTGCCGGATCAGCTCGTCGGCGTCGAGCGGCACCGACAGTTGCACGGTGCGCACCC 
GCAGCGCCAGGGAGACCAGGCGCTGTTGGGGGCCGTCGTGCAGGTCGCGTTCGATACGGCGGCGGGCGGTGTCGGCGGCG 
GCGACGATCCGGGCCCGTGACGCGGTGAGGGCCGCCTGCGTCTCCGCGTTGGCGATGGCGGTGGCCACCAGTTCGGTGAA
GCCGGCCAGCCGGTCCTCGGTGTCCGACGGCATCGGCTTGTCGTTCATCGACGCCACGCTGAGCGCGCCCCACAGTTGTC 
CGTCGACGTTGATCGGCATGCACACCGTGGCGCGGAATCCCCACTCCTTGCCGACGACGGAGGCCGGGCCCGAGGACACG 
GCCGCGTAGTCGTCGATCCGCGCCGGGCAGCCCGACTCGAACACCAGGGTGTGCACATTCCGGCCGCCGGGCGGTACCTG 
GATACCGGCGGGAAAATCACGGCCGGTCCTGGTCCAGGCGGCGACATACAGGGCGGTTCCGTTGGGCTCGTAACGGCCGA 
GGACCGCGAAGTCGGCCGAGAGGAGCTGTCCGGCCTCGGCGGCGACCGCGGCGAACACCTCCTTCGGCGGTGCCGCCCGC
GCGACCAGGGTCGCCACGCGCCGCAGCGCCGCCTGCTCCTCGGCGGCCCCCCGCAGCTCCACACGTGCCTGGGTGTTCGC 
GATGGCGGTGGCCACGAGGTCGGTGAAACCGGCCAGCCGGTCCTCGGTGTCGGGCGGCAGCGGTTCCGCGGTCAGCGAGA 
TCGCCATCATCACGCCCCACAGCCGTCCCTCGACGTTGATCGGCACGCCGACGACCGAACCGAAGCCGCGCGCCCTGGCG 
AAGTCGGCGGGTGCCCCGGACGACTCGGCGGCGTCGTCGATCCGGGCCGGCCGCCCCGTCTCGGACACCAGCGTCACCAC 
GTTCCGGCCGTCGGGGTCCACCCGGGTGCCGATGGGGAAGAGCGGGCCGTGCAGACTTCTGGACCAGCCGCCGACGGCGC
TCGCCATGCCGTCCGGATCGAGCCTGATGATTCCGGTCACATCGTTGCCGAGCAGTTCTCCGACTTCGGCGGCGACCGTC 
GCGAACATCTGTTCCGGTGGGGTGGCCCTGGCCACCAGGGTCGCCACCCGTCGGAGTGCCGCCCGCTCCTCGACGATCTG 
TTCGCACGACACGACCGCTGCCAGGCCCCCCTACCCGCCCGATGACGCCCGCATACCGGGTATCACGGCACATCAGCATG 
ACGTCCGCCGTGAACGCCCGTCAACGTGGCCCGCCGGAGTCGGGAACACGCGTCCGGAATCAGCCCCCGGAACGGCGGGA 
CCGTCTTCCTCCGTCCGGCGCGGGGCACTGCGCCGCGGCGGAATCCGCCCTGACCTCGGGAGTTTGCAGCTAGCTGGAAT
CAGCGGTTCGGGTTGGTGGGAAGGGATGTTGGCCGCTGGCGGCGATGCGGAAGCCGATCGTTCCCAGTACTTCTGGGAAG 
TGCGTCGCGGAGAGTCGGTCCGCTTCCCCGAGTGGGCCGCGACGACGCTGCGGGTTCTCCACGGGGGAGAGATCCGCGAA 
CCGGCGAAGGAGCTGCCGTGTCGGACGTCTTCGCATCCGAGAAGAGTTCGCCCGGTGTCCGGACCCGCGCGGCAACGTCC 
CCACCGCGCTCTGTCATCAGCGCCGTCGGCGCCGTCAGCCACGCAGAGAAGATCGGATACGCAGTGTACGAGTGCAGCGA 
TGAGGTTCGTCACGACGTCCCCGGCCTGCCGGGTCCGTCACCGTCCATCACCGTCCTGGGCTGTCTGGGCGTACGCGCCG
ACGGCCGGAAACTGGAGCTGGGCCCTCCGCGTCAGCGGGCCGTTTTCGCCCTGCTGCTCATCAACGCGGGCAGTGTGGTG 
CCGGTCGACTCGATCGTCTTCCGTATCTGGGGCAACTCACCACCGGGCGCGGTCACCGCGACGCTCCAGTCCTATGTGTC 
CCGGCTGCGGAAACTCCTGGCCGAGTGTGTGCTCCCGGACGGTTCGACACCCGAACTGCTGCACCAGCCGCCGGGCTACA 
CCCTCGCGCTCGGCACCGAGCACATCGACGCGAACCGTTTTGAGCAGGCCATCAGGACAGGGCGCCGGCTCTCGCGCGAG 
GAGCAGCACCAGGAGGCGCGGGCCGTGCTCTGCCAGGCCCTGCTGAGCTGGGGCGGGACACCGTACGAGGAGCTGAGCGC
GTACGACTTCGCCGTCCAGGAGGCCAATCGGCTGGAGCAGCTCCGGCTGGGCGCCGTGGAGACATGGGCGCACTGCTGTC 
TGCGGCTGGGGCGGGACGAGGAGGTGATGGACCAGCTCAAGCCGGAGGTGCAGCGCAATCCGCTGCGGGAGCGGCTGATC 
GGGCAGCTCATGCAGGCGCAGTACCGGCTGGGGTGCCAGGCGGACGCGCTCAGGACGTACGAGGCGACGCGGCGGGCCCT 
GGCCGAGGAGCTGGGGACCGATCCGGGCAAGGAGCTGGCGGCGCTGCACGCGGCGATCCTGCGTCAGGACAACGGTCTGG 
ACCGCGTCGTCCCGGCGTCCGCGCCGCCGTCGGCGGGGGTCGGGCGGGGGGCCGTGACGGTGTCGGTCCCGGCACAGCGG
TCGAGGCCGTTGACGCGGCCGGTGGCGGGGCGGGCGCGGGTCCCGGGGGCGATGACGGTGGCGGCGGGCGCGGGGGCGGC 
CCCCGCGTCCGCCTCCGGCTCCGTTTCCGCGTCCGTTTCCGGCTCCGGCTCCGGCTCCGGCTCCGCTCCTGCGTCGGTTC 
CCACCTTCTTTCCCGGCTCCGTTTCTGGCTCGGCGTCCGTTGCCGCGTCCGTAGCCGCGCCCGTTTCCGGCCATGTCTCC 
GGGCCCGGGTCCGCTTTCGGGTCCGTGGCGCTCCACCGGCCGCAGACCCTCCGGGGCGAGCCGGTCCACGGGGGCGCGCA 
GGGGATGCGCACCGGGCAGGTGTTCCCCACGCTGCCGCCGTTCGTCGGGCGCGGCGACGAGCTGCGCGGTCTGCTGGAGT
CCGCGACGTCCGCGTTCCACACCTCGGGGCGGGTGGCGTTCGTCGTCGGCGAGGCGGGCAGCGGCAAGACCCGGCTCCTC 
TCCGAGTTGGAGCGCTCGGTTCCGGACAGTGTGCGCACCGTCTGGGCGTCCTGTTCGGAGAGTGAGGACCGGCCCGACTA 
CTGGCCGTGGACGACCGTGCTGCGGCATCTGTACGCGATGTGGCCGGAACGTATGCACGGATTCCCCGGTTGGCTGCGGC 
GCGCACTCGCGGAACTGCTTCCCGAGGTGGGCCCGGAGCCACAGGGGCCGCACTCCCCCGACGGGGGCGAGGAGAACAGC 
GGCAACGGGGACGGTGCGGGCGACGGGGACAGCACCCCGGCGCACACCCTCACGCTCGCGCCCGCTCTCGCGCCCCCGCG
CTCCAGAGAGGCTCGTTTCACCCTGCACGACGCCGTGTGCCAGGCGCTTCTGCGCACGGTCCGCGAACCCGTGGTGATCA 
TGCTGGAGGACATGGAGCGGGCCGACGCCCCCTCGCTCGCCCTGCTGCGCCTCCTGGTGGAGCAACTGCGCACCGTCCCC 
CTGCTGCTCGTGGTCACCACGCGCACCTTCCGGCTCGCGCACGACGCCGAGCTGCGACGGGCCGCCGCCGTGATCCTCCA 
GTCGACCGGCGCGCGCCGGGTCCTGCTGAACGCCCT

AGGCCCCGGACACCCTCCTCGTACGGGCCCTGCACGAGCGCTCCGCCGGGAACCCGTACTTCCTCGTCCAGCTCCTCCGC 
TCGCTCCGGCAGGGGCTCGCCGCCGCCTGGGAGACGGAGATCCCGGACGAGCTGGCCGGGGTCGTGCTGCAACGGCTGTC 
GAGCGTGCCGCCCGCCGTGCGCCGGGTGCTCGACATCTGCGCGGTCGTGGAGCGCAGTTGCGAACGGCGTGTGATCGAGA 
CCGTGCTGCGCCATGAGGGAATCCCGCTGGAGAACGTCCGTACGGCGGTCCGCGGCGGTCTGCTGGAGGAAGACCCCGAC
GACCCCGGGCGGCTGAGGTTCGTGCATCCGCTGGTCCGGGAGGCCGTCTGGGACGACCTGGAGAACACCCGTCGGCCCGT 
GTCCCGTTCCTCCGCGCTCGGGGCGCTGGCCACGGTCTGAGTCCCGGGCCCCGGGGTCCTCGGCGGCGGGCGGCGCTTGC 
GCGCTCCCCGACGCCGGGCTTGATCCCCCGGGGCAGCCGGACGCGCAGCCGGGTGCAAGGGGCGGTGCCGACACTGGGCG 
GGCGGCGGCCGTGGCCGGTCGCCGCCCCCCACGGCCCACCGAGGAGCCCCCATTGGACACGTACGCAGCGGATACGTACC 

CGCGGTCCGGCACCCACCCCGAGCCGCGTCCCGACGCACCTCCCCACGCGCGTCCCGGGACCCGTCCCGGCACCCGTTCC
GAGCCGCGCCCGGACCCGGGCGCCGAGGCCGCGTGGCTGCTCGCGGCGGACCGCGCCCATATGTTCCACCCGGTCCTGCC 
CCGGGGCCGCGAGGACCGCACCGTTCTGGTCTCCGGCCGCGGCTGCACCGTACGGGACACCGAAGGGCGCACCTATCTCG 
ACGCCTCGTCGGTGCTCGGACTGACCCAGATCGGCCATGGACGTGAGGAGATCGCGCAGGCCGCCGCCGAGCAGATGCGG 
ACACTCGGTCACTTCCACACCTGGGGCACCATCAGCAACGACAAGGCCATCCGACTGGCCGCGCGCCTCACCGACCTGGC 

GCCCCAGGGTCTCCAGCGCGTCTACTTCACCAGCGGCGGCGGCGAGGGCGTCGAGATCGCCCTGCGCATGGCCCGTTACT
TCCACCACCGCACCGGCAGCCCGGAGCGCACCTGGATCTTGTCGCGCCGCACCGCCTACCACGGCATCGGCTACGGCAGC 
GGTACGGTGTCGGGCTCGCCCGCCTACCAGGACGGGTTCGGCCCGGTGCTGCCCCATGTGCACCACCTCACGCCGCCCGA 
CCCGTACCACGCCGAGCTGTACGACGGCGAGGACGTCACGGAGTACTGCCTGCGCGAACTCGCCCGCACCATCGACGAGA 
TCGGCCCCGGGCGGATCGCCGCGATGATCGGGGAGCCGGTCATGGGCGCGGGCGGCGCCGTCGTCCCGCCGCCGGACTAC 

TGGCCGCGCGTCGCCGCGCTGCTGCGCTCCCACGGCATCCTGCTGATCCTGGACGAGGTCGTCACCGCGTTCGGCCGCAC
GGGGACCTGGTTCGCGGCCGAGCACTTCGGGGTGACCCCCGATCTGCTGGTGACCGCGAAGGGCATCACCTCCGGGTATG 
TCCCGCACGGGGCGGTGCTCCTGACCGAGGAGGTCGCGGACGCCGTGAACGGGGAGACGGGGTTCCCGATCGGCTTCACC 
TATACCGGTCACCCCACGGCGTGCGCCGTCGCGCTCGCCAATCTCGACATCATCGAACGGGAAGGGCTGCTGGAGAACGC 
GGTGAAGGTGGGCGACCACCTCGCCGGGCGGCTGGCGGCCCTGCGCGGGCTGCCCGCCGTGGGGGACGTCCGGCAACTGG 

GCATGATGCTCGCCGTCGAGCTGGTGTCGGACAAGACGGCCCGCACCCCGCTGCCGGGCGGCACCCTCGGGGTCGTGGAC
GCGCTGCGCGAGGACGCGGGCGTCATCGTCCGGGCCACGCCGCGCTCCCTGGTCCTCAATCCGGCGCTCGTGATGGACCG 
GGCCACGGCGGACGAGGTGGCGGACGGGCTGGACTCGGTGCTGCGGCGGCTGGCACCCGACGGGCGGATCGGCGCGGCCC 
CCCGGCGGGGGTGACGAGACCGCGGGCCGCCACCCGCGGGGGGCGCCGGGTCGGCACAGCGGCCGACCCGGCGCCTTCCC 
CGTTTCCCGGCGCCTTTTCCGTGCCCCGGCGCCGTTCCCGTGGCCCCTGCCCCTGCCCCTGCTCGGGCGCTCCTCCCTCC 

GCTGTGGCGCCGTTCCCGTTCCAGCGCGCTGTCGAGCCGCCGCCAAGCGCCCCGTGCCACGGTGGGAGACCGCCGCCCGA
CGGGGCGCGCGGAGCCCGGCAAGCCGAAGGGAAGTCCCGTCCGATGCGTGCCTCTTCGCCCAGAGGGTTCCGCGTGCACC 
ACGGTCACGCCGGGATCAGGGGGTCCCACGCGGACCTCGCCGTCATCGCCTCCGACGTTCCCGCGGCGGTCGGCGCGGTG 
TTCACCCGTTCGCGGTTCGCCGCGCCGAGTGTGCTGCTCAGCCGGGACGCGGTCGCCGACGGGATCGCCCGGGGCGTGGT 
GGTGCTGTCCGGCAACGCCAACGCCGGGACGGGCCCGCGGGGGTACGAGGACGCCGCGGAGGTGCGCCATCTGGTGGCCG 

GGATCGTCGACTGCGACGAGAGGGATGTGCTGATCGCCTCCACGGGACCCGTCGGCGAGCGGTATCCGATGTCCCGTGTC
CGGGCCCATCTGCGGGCGGTGCGCGGGCCCTTACCGGGTGCCGACTTCGACGGCGCGGCGGCGGCCGTGCTGGGCACCGC 
GGGCGCCCGTCCCACGATCCGGCGGGCGCGGTGCGGCGACGCGACGCTGATCGGTGTCGCCAAGGGCCCGGGTACGGGCC 
CGGCGGAGCAGGACGACCGGTCGACGCTGGCGTTCTTCTGCACGGACGCCCAGGTGAGCCCCGTCGTCCTCGACGACATC 
TTCCGCCGGGTCGCGGACCGCGCCTTCCACGGGCTGGGCTTCGGCGCCGACGCCTCCACCGGCGACACGGCGGCCGTTCT 

CGCCAACGGGCTCGCGGGCCGGGTGGACCTCGTCGCGTTCGAACAGGTCCTGGGCGCGCTGGCGCTGGACCTGGTCAGGG
ACGTCGTCCGGGACAGCGGCTGCGGCGGCGCCCTGGTCACGGTGCGGGTCACCGGGGCCCACGACACCGAGCAGGCCGGG 
CGCGTGGGCCGGGCGGTGGTCGACGCGCCGTCGCTGAGGGCCGCGGTGCACGGCCCGGCACCCGACTGGGCGCCGGTCGC 
CGCCGTGGCGGGTGGACACGGGGACGAAGGCCCCGGCCGGTCTCCCGGGCGGATCACGATCCGGGTCGGCGGCCGGGAGG 
TCTTCCCCGCCCCCCGCGACCGGGCCCGCCCGGACGCCGTCACCGCGTATCCGCACGGCGGCGAGGTGACCGTCCATATC 

GACCTCGGTGTCCCGGGCCGGGCGCCCGGCGCGTTCACGGTCCACGGCTGCGACCTCCTGGCGGGGTACCCGCGCCTCGG
CGCCGGCCGGGCCGTCTGAACGGGCGCTCCCGGGCGGACGGCGACCGCGAGGGCGCGGGAGCGCAGGGAACACGGGAGCG 
GGCCCGGTGGTCGATCGGCCACCGGGCCCGCTCCCGTCGTTCCGTCCGCTGTCCCCGGCCGCCCTACCCCCACCGCTGCC 
CGGCGAAGTCCACGGCGCTCTCGGCGTCCACCGCGTCCACCGCGTTCTCGGCGTTCTCGGCGTCGTCCGCCGCCGCCCCC 
GGTGGCAGGGGAGAGTCCACCGGTGCCGACGCGGGCGACGTGGTGGCGCGGGCGTACTGGTAGAGCAGTTCGGCCCCGAT 

CTCCGCCGCCAGCAGGGAGGTGATCCCCGACGGGTCGTACGCCGGGGACACCTCGACCACGTCGAAGCCGACGGGCCTGA
GCTGCCCGACCACGTCGAGCAGGGTCAGCACCTCGCGCGAGGACAGCCCGCCGGGGGCCGGTGTGCCGGTGCCCGGGGCG 
TACGCCGGGTCGACGACGTCGATGTCGACGGAGACGTACAGCGGCAGGCCGCCGACGGTGCGCCGGATCTGCTCGGCGAT 
GCCGCGCGGTGAGCGCCGGGTGAAGTCGGCGGCGGTGACGATGCTGACGCCGTGCCCGCGCGCGTAGTCCAGGGAGTCGG 
GCCGCGGATTGTGGCCGCGGATGCCGACCTGGACCAGGCGCTCCGGGTCCACCAGGCCCTCTTCGATGGCCCAGCGGAAG 

GGGGTGCCGTGGTGGTAGGTGCCGCCGTAGACGGGTGGGTTGGTGTCGCTGTGCGCGTCCAGGTGCAGGACGGCGACCCG
GCCGTGGCGGGCGTGCACGGCGCGCAGGGCGGCCAGGGAGAGCGAGTGGTCCCCGCCCAGCATCAGGAACGCGTCGTTGC 
GTTCCAGGAGCCGGGTCAGGGCGACCGTCGCGGTGTCCATCGCCAGGTCCATCGAGAAGGGGCTGAGGTCGATGTCGCCC 
CCGTCGACCACGTCGATCCGGTCGAAGACCCCTGGGCCCCGGTCGATGCCGACGCCGTGGATCAGGCTGGACTCGTGCCG 
GATGGCGCGCGGCGCGAACCGCGCGCCGGGCCGGTAGCTGGTGCCTCCGTCGTACGGGGCGCCGACGACCACCACGTCAT 

GGCCGATCGGGTCGGGCCGGTGGCGCAGCCGCATGAAGGTCGCCGGTTGGGCGTAGCGCGGGGAGACGGCGGTGGACACC
CTGGCCGTTCCCCGCGCACCCGGCCCTGCTCCCGTTCCCGTACCGACGCCCGGCCACCCCGTGCGGGCTCCCGTTCCCGT 
GCCGACCCCCGTTCCCGAACGGGCTCCCGTTCCCGCGTGGAATCCCGTTCCCGCGCCCGCGGCGCCGTCCGGGCCGCGGC 
TGCCCCTCCCTCCGAGACCGCTCCTGCCGTTCCTGCGGCCGTTGCCGCTCTGCGGGCCGGTGCCCGCGCCCACGCCCGCT 
GCACCGTCCGCGCCGCCGCCGGTGCCGTTGCCGCCGCCGGTGCCGTTCTGGCCACCGGTGCCGTTCTGGCCGCTCATACG 

ACCACCCGGCCCTGGAGCCTGAGCCTGCGCACCGCGTCGACGGAGCGCCGCACCGTCTCGCCGAAGTCCACGTCCTCCGG
CGGCACCGTGTCGATGACCACCGCGTCGTACAGGCGCCGTGCCATGGCGCCCTTGACGGCCGTCACCTCGTCGCGCCGGA 
TCCCTTCGGCGAGGAGCAGTCCGGTCCACGCGCTGGTGGTGCCGGACCCCTCGTGGATGCCCAGCTTGGGGCGGGCCACG 
GTCTCGGCGGGCAGCAGGCCGGAGAGGGCCTGCCGCAACACCCACTTGTCGGTGCCCCGCCGGCGTTTGAGCCCGGGTTC 
GAGGGAGACCAGCGCGTCCAGGACCGCGCGGTCCCAGTACGGGTGGGTGGTCCACTTCCCGGCGATGCCCGCGAGGACGG
GGGACATCTCGTTGAGGCCGTCGAAGCCCGCCATGTCGCCCGCGATCTCGTCGTCGAGGGACCAGAGCGAGGCCGTGCGC
CGGTGCATACCGCCGAGCGGGATGTCGGCGCCGTACCCGGTGAGGATGCGGAGCGGCCCGGTGTCGAGCCGCCGGTAGAG 
GGCGACGAGCGGCAGCAGGTACTCCAGGACCGTGGGGTCGGTGATCTCCGCGGCGGCGACCGCCCAGGGCAGTTCCCTGA 
CGAGTTCGGCCGAGTGGAGCCGGATCTCGCTGTGCGCGGTGCCCAGGTGGACGGCGACCGAGCGGGCCGCGTCGAACTCG
TCGGACACCTCGGTGCCCATCGACACGGACCGTGTCCCGGGTGCCAGGGCCGCCGTGTGGGCGGCGACTCCCCCGGAGTC 
GATGCCGCCGGACAGGACGACGGTGGGGGCCGCCTCCCCGCCGCGCAGCCGGGTGCGGACCGCCGTGGCGAGGCGTTCGC 
CGACCAGGTCCACCGCCTCCCGTTCGCCGGGCAGCGCCCGGGAGAGCGGGGGTGTCCAGGTGCGGACCGCCCTGGCGGTG 
ATGTCGGAGCCGCCGACTCCGTGCAGCAGGAGGGCGGTCCCGGCGGGGACCCGGCAGACGCCCGCCGCCCCCGGCGCGGT 

GTGGGTGCCGGACAGGCCCAGCGGCCGGCCCGGCTCGTGCGCCAGGGTCTTCGCCTCGGTGGCGGCGCTCAGCCCCGTCA
CGTCGGCGCGCAGCCACAGCGGTACCGAACCGGCGTGGTCGGTGGCCGCGACGGTCGCGCCGGTGGAGGCGTCGGTGAGC 
AGTGCGGCGAACCGTCCGTTCAGGAGCCGGAAGGCCCCGGGGCCCCAGCGCCGCCAGGCGGCCAGCAGCAGTTCGGCGTC 
GCCGAGGGCGGCAGAGGAGCCGCCGAGCGCTCCGGTCAGCTCGGCGCGGTTGTACAGCTCGCCCGCCAGGAGCAGCCGGA 
CCTGGCCGTCGGCGACCAGGACGGGCGGACGGCCCAGGGTCACGGCCGTTCCGCTCCAGAGCGGGTACGCGGTGCCGTCG 

TGCACGGGGACATGGGTCCCGCGGACGGCGAAGCGGGGTGCGCTGCCGGGTTCGGAGTGACCGCCGGGGCCGCCGCCGGG
GCGGCCCTCGGTGCCGATGCGCACCCGGAATCCGTACACGAGGTCGGGGCCGGGCATGGTGAACTCGTCCTCCACGGTGG 
TCAGATGGCCAGGGCGGCGAAACCGCCGGACTGGAAGTCGTAGGCCACCGGTACCTCGATCAGGAACGGGCGGCCGAGTC 
CGGCGCCCTTGGTGAGGGCGGCGAGCAGCGAGGTGCGGTCGGTGGCGCGGACGGCCTCGCAGCCGTTGGCCTCGGCGAGC 
TGGACGAAGTCGACGCTTCCGAAGCCGACGGCGGGGGCGTGGGAGCGCTGGTGTCCGAGGTTCTGGTACAGCTCGATCAG 

GCCGTTGCGGTCGTTGTTGACGACGACCATGACGATCGGCAGGCCCAGGCGCACGGCCGTCTCGATGTCGGCGCTGTTGG
AGTGGAAGCCGCCGTCGCCCGCGATGAGGAAGACGGGCTCGCCGGGCCGGGCGATCTGGGCGGCCATGGCGGCGGGCAGT 
CCGTAGCCGAAGCTGGAGCAGCCCGCGGAGGTGAGGAATCCGTACGGCTGGTCGGACTTGGCGAAGAGCACGCCGTAGTG 
GCGGAAGAAGCCGATGTCGCTGACGAAGGTGCCGTTGTCGAGGACGGAGTTCATGCAGTCGATCACCTGGTGGACCCGCA 
TGCCGTCCTCGTACTCGGTGGGGTCGGCGAGGAATTCGGCGACGCGGGCGCGCAGGGCGCTGAGGTCGTGCCGGGTCTTG 

GGGGCGAGGCCCGAGGTCGCGTCGTCGAGCGCGGTGACGAATTCGGCGACGTTGGTGACGATGTCGATGTCGGCGCGGAA
CAGCTCCGGGATCGGGTTGACCTCGGGGGCGACCCGGACCGTGGTCTTGGCCCGGCCCCGCGTCCACATGGAGGGGCGCA 
GGTCCTCGGCGTAGTCGTAGCCGATCGCCAGGAGGAGGTCGGCGGGGCCGAAGATCTCGTCGAGGGCCGGGTGGCCGAGA 
ATGCCGTCCATGTAGCCGCTGATGGCGCCGTAGTTGAGCGGGTGGTCGTGCGGCAGGACGCCCTTGGCGGTGTAGGTGGT 
GACGACGGGGATGTTCAGCCGCTCGGCGAGGGCGCGCAGGGCGTCGACGGCCCCGGCGCGGATGACGGCGCTACCGACGA 

CGAGGAGGGGGTTCTCGGCCTCGCGCACCAGCTCAGCGGCCTCGTCGAGGCGGGCGCGCCAGTCGGCGTCCAGGGCGTGG
GTGGCGGTGGCCCGGACCAGGGGGGCGTCGGTGGGGGTGCCGTTCAGCTCGGCGCCGAGGAGGTCGACCGGCAGGCTGAT 
GAAGCTGGGACCCACGGGCTCGATCCGGCTGTTGAGGACGGCGCTGTCGACGAGGTTGACGATGTCCTCGCCGCGTTCGA 
GCTGGACGCTGAACTTGGTCAGCGGGCCCATCACGGCGGTGCTGTCCAGGCACTGGTGGGTGACGTTGGGGTAGCAGTCG 
TACGACTCGGACTGCGCGGCCAGCGCGATGACCGAGCTGCGGTCCAGGGCGGAGGTGGCGACGCCGGTGGCCAGGTTGGT 

CATGCCGGGGCCCAGGGTCGCGAAGCACGCCTGGGGGCGGTTGGTGATCCGGGCGAGGACGTCCGCCATCACCCCGGCGG
TGAACTCGTGCCGGGTCAGGACGAAGTCGAGTCCTTCGACCTCGTCGAAGAGAATGGCGGACGCCTCCCGGCCGACGACG 
CCGAATACATGGTCGACACCGTACTGGTGAAGACGTTCCAGCATGGCTTTCGCGGTCGTGGTGGCCATGGAGATCTCCTT 
CGCATCGGACGGGCGCCGGGATGGCGCCCCGGAAAACGCGGCACCGGGCGGTGCGCACCGGGTGGCGCACACCGTGGGTG 
GTGGCGTTGCCACTGTGCGGATCGCCTCTTGGCGGCGGTCGGACGCCCGGCTTGGACAGAATGGGCAAGGCGCGTTCAAG 

GCATGGCGTCCATCGTCCTCGTGGCGCTTTTCGTGAAATCCGTCCGGCGCCGACGGTCTCCATCCGATTCCGTCCCCTTC
CGTCCACCGATCCGAGGAGAATCCATGGATGTCCTGGCCGCGTTGGAGCGCAAGCCCAGCCTGAATCTTTTCCCCATCGA 
GAACCGGCTGTCGCCGCGCGCCAGTGCCGCGCTGGCCACCGACGCCGTCAACCGCTATCCGTACTCCGAGACCCCGGTGG 
CCGTCTACGGCGATGTCACGGGGCTGGCCGAGGTGTACGCGTACTGCGAGGACCTGGCCAAGCGCTTCTTCGGGGCGCGC 
CACGCCGGTGTGCAGTTCCTGTCCGGTCTGCACACCATGCACACCGTGCTGACCGCCCTGACCCCGCCCGGCGGGCGCGT 

CCTGGTCCTCGCGCCGGAGGACGGCGGCCACTACGCCACGGTGACGATCTGCCGGGGCTTCGGCTACGAGGTCGAGTTCT
TACCTTCGACCGCCGGACACCTGGAGATCGACT 

SEQ ID NO: 17 cvm cluster

GGTACCGGCATCCGACCCAGGCCCCGGGCGCAGGACCCGGAGGCAGGCACCGGCACACCC 
CGGCCGGGCGGCCCGGCTCCCGGCGGTCGGTGTCCGGCGACCCGCAATCGGCAGCCGCCC
CAGGCCCGGGACAGGAGCCCGGCTCAAGGCACCGGCCCTGCGCACCCGCTGAGGCGGCAG 
GTTCCTGACAGCCGGCATCCGCCAGTCGGCGCGGGGCAGCCGCCCCAGGCGCCCGGCCCG 
GCACACCCGTGCGAGCGCCCGGCTCCCGGCGGTCGGTGCCCCGGAGGCGGCGACCGGCAG 
CCGGACACGGCCCCGCTCGGGGCGCGGCCCAGGGCACAGGCCCTGGGCACCCGCTCGGAC 
GCCCGTTCGGACAGCAGGCCCGTGGGAAGCCGCCGGTCAGGCCCGCAGGCAGCCACCGGT
CGGCGGGCGGATCAGGTGTTGGCGGGGGACTCGTCCGGGAAGATCTTTGTGACGACGGTC 
CCGTCCTCGGTCAGATAGCCGTGCAGCATCCCGGGGCTGCTGTGCGGCGCGTCGAAGTCG 

CCCCGGGGGTCGAGGGCGATCACGCCGCCCTGCCCGCCGAGCCGGGGCAGGCGCTTGACG 
ATCACCTCGTAAGCGGCGGACGCCACGCCGAGCCCCTTGAACTCGATCAGATGGGAGAGG 

GTCGAGGTCGCCGCGCCCCGGATGAACACCTCACCGGCGCCGGTGGCGCTCGCGGCGACG
GTCCGGTTGTCGGCGTAGGTCCCGGCCCCGATCAGCGGGGAGTCGCCGATCCGGCCGGGG 
AGCTTGTTGGTGAGCCCGCCGGTGGAGGTGGCCGCCGCGAGATCGCCGCGCCGGTCGAGG 
GCCACCGCGCCCACCGTCCCCGTCGACTGCGCGTCGGCCAGTGCCTCCGGGGCCCTCCGG 
GCGGCGGGATCGCCCGCCTCGGTCTCCTTCGCGCGCAGCAGCGCGTCCCAGCGGGCCTGG 

GTCCAGTAGTAGTCCTGGGTGACGGTGCGCAGCCCGTGCCGGGCGCCGAAGTCGTCGGCG
CCCTCGCCGGAGAGGAGGACGTGCTTCGACTTCTCCAGCACCAGCCGGGCGCCCTCGACC 
GGGTTGCGCAGGGAGGTGACCCCGGCGACCGCTCCCGCCTTCAGATCGGAGCCCCGCATC 
ACGGAGGCGTCCAGCTCATGCCCGGCGTCGGCGGTGAAGACGGCGCCCTTGCCCGCGTTG
AACAGCGGGTTGTCCTCCAGTTCGCGGACGGCGGCCTCGACCGCGTCCAGGCTGTCCCCG 
CCGCGCGCGAGCACCCGCTGTCCGGCGCGGAGCGCTGCGGCGAGCCCGTCCCGGTACGCC 
TTCTCCCGTTCCGGGCCGGTCGTCTCCCGGTCCAGGGCGGCTCCGGCCCCGCCGTGGACG 
GCGATGACCACGTCACGGGCGTCCGGCCGGGGCTTCCCCGGCGCGCTCCCCCGTTCCTTC
TTCTCCTCCCGCGCCTGCTGCTCCTGCTTCTGTTGCGTCGTGTGGGCCGCCGCGGTGGGT 
CCATGGCCGCCCGAGGCCCCGGGTACGACGATGAGCGTGGTCGTCAGCACCGCGGCGGCG 
AGCAGGGAGGACGCCAGCCAGGCGGTGGCGGGGCGGTGGGGCATCGGGCACTCCTCGGGA 
CGGGGGTGAGAGACGCTCCGGCCGACTGTACTGACATGCCCATGCCCCCTCTAGTGCCCC 
GGAGCCGCCTTCCGCCCTCCCCGCCGCCCGGCGGCGCCCGCCCGGCGCGCTCAGTCCAGG
GCCAGGTCCTCCGGGGCGGAGCGGGCGAGTCCGGCGAGTGTGCCGAGCGCCCGGGTCAGT 
TCGTCCGCCGACGGCGACGCCAGGCCCAGCCGGACCGCGTGCGGTGTACGGCCCTGCCCG 
GCGCAGAACGCGGCGGCGGGCGTCACCCCGATCCCGTGCCGCGCGGCGGCGGCGACGAAG 
GTGTCGGCGCGCCAGGGGCGGGGCAGCACCCACCAGCAGTGGTACGAGCCGGGGTCGCCC 
GACACGGCGAAGCCGTCGAGCGCGCGCCGGGCGATCTCCTGCCGTACGCCCGCGTCCCGC
CGCTTGGCGCGTACCAGCGCGTCGACCGTGCCGTCGGTCTGCCAGCGGACCGCCGCCTCC 
AGCGCGAACCGCGCGGGGCCGAGACCGCCGGAGCGCAGCGCGGCGCCGACCGCTCCGTCG 
AGCCCCGGGGGCACCACCGCGAACCCCAGGGTCAGCCCGGGGGCGAGCCGCTTGGAGAGG 
CTGTCGACGAGCACCGTCCGCCCGGGGGCGACCGCCGCGAGCGGAGCCGTGCCCTCCCGC 
AGGAAGCCCCAGACGGCGTCCTCGACCGCGGGAAGGTCCAGCCGCTCCAGGACCGCGGCG
AGCTGGGCGAGACGCCCGTCCGACAGGGTGAGGGAGAGCGGGTTGTGCAGGGTGGGCTGG 
ACATAGACCGCCCGGAGCGGAGCGCTCCGGTTGGCCTCGTCCAGCGCCTCCGGAATCACC 
CCGTCCGCGTCCATGGCGAGGGGGACGAGCGTGATGCCGAGCCGGGCCGCGATCGCCTTG 
ACCACGGGGTAGGTCAGCTCCTCGACCCCCAGTCGGCCCCCCGGCGGCACCAGCGCGCCG 
AGCACGGCGGAGAGTGCCTGCCGACCGTTGCCCGCGAACAGCACCCGCCGGGGGTCCGGC
CGCCAGCCGCCCCGGGCGAGCAGCCCGGCGGCGGCCTCGCGCGCCTCGGGGGTCCCGGCG 
GCACCGGCCGGCCGGAGCACGGACTCCAGGACATCGGGCCGCAGCAGCCCGCCGAGCCCG 
GTGGCCAGCAGCGCGGCCTGCTCGGGGACGACGGGGTGGTTCAGCTCCAGGTCGATCCGG 
CTTCCGGCGGGCTCGGAGAGCGCGGGGCCGACGCCCGCCCGCGCCGCGCGGACATAGGTG 
CCGCGCCCCACCTCGCCGACGGTGAGCCCTCTGCGGGCCAGCTCCCGGTAGACCCGGGCG
GCGGTGGAGTCGGCGATGCCGCACCCGCGGGCGAACTCCCGCTGCGGCGGAAGCCGGTCC 
CCGGGGCGCAGCCCGCCCGTCCTGATCTCCTCGGCGACCGCGTCGGCCACCTGCCGGTAG 
TCCTTCATCTCCCGTACCTCCCCTGTCCGGTGGACCGCTTCCCGCCCGGCCCCGCCGACC 
GTGAAACGGAAGCACCCCGTTCCGGAGCTCGAGCTCCCCGTCCGGAAGCTCCCCGTCCGG 
AAGCTCCCCGTTCCAGAATTGCACCGAGAGCAATATTCCCTATTGCACCGATCAAAACAC
CGATCTACGCTCGGAATTGCCTCACACAGACCGTCGACGCATCTGCCGCACACCGGTACT 
GACGCCCCGTCGGACCGCACCCGCGCGGAGCCGTCGCCCCGCCCGCCCCGTTCGCGCACA 
GGAGAGAGAAGGAGATGGTGGAGACCAGCGCACTCGCCGGTGTGGTGATGGTCGCCCTCG 
GAATGGTCCTCACCCCGGGACCGAACATGATCTATCTCGTCTCCCGCAGCATCACCCAGG 
GCCGACGTGCGGGGATCATCTCGCTGGGCGGTGTGGCCCTCGGTTTTCTGGTCTATCTGC
TCGCCGCGAATCTCGGCCTGTCGGTGATCTTCGTCGCCGTGCCGGAGTTGTATGTCGCGG 
TCAAACTGGCCGGTGCGGCCTATCTGGCATATCTCGCCTGGAACGCCCTGCGGCCCGGTG 
GCGTGAATGTGTTCTCCCCCGAGGAGGTTCCGCACGACTCCCCGAGCAGGCTGTTCACCA 
TGGGGCTGATGACGAACATCCTCAACCCCAAGATCGCCGTCATGTATCTCGCACTCATCC 
CGCAGTTCGTCGACCCGAACGCGGACCGTGTCCTGTTCCAGGGGCTGATTCTCGGCGGTC
TCCAGATCGCGGTGAGCGTCGCGGTCAATCTCGCGATCGTGCTGGCGGCCGGAGCCATCG 
CCGCCTTTCTCGGCCGCCACCCCTTCTGGCTCAGGGTTCAGCGCCGCGTGATGGGCGCGG 
CGCTCGGTACGCTCGCGGTCTCCCTGGCCCTCGACACCTCCGCCCCCGCCGCACCCGTCT 
CCTGAGGCCGCCGGACCGGGAGCCGACGCGAAGGCACCCCTGGGCAACCGTTCGGAGAGC 
TTATCCGTTACCCCATGAATCCCGATATAAGTGCATTGGCCACTTACCCATGCATGGAAC
AGGCCAACCTGACCAAAAAATGAGCCCTCCCCACCCGGAATAGATGCTTCCCAGTGTGAA 
GAAATTTCATAGCGGGAGCGTCTGCCGAACAGGACGGCCCATACGCCGCAAGGCAGAACG 
GACATCGCCGCCCGCCCGGGTCCAGAAAATTCGGAGGACACATCGGACGACCGTCTCCGC 
ATCGGCGTCAACTCCCGATTACAGAGAATATTGAGTACGTATCAACCGGGCCTTGATCTA 
CTCAGCCTCCATTGTTCTCTCCAGTCGGGATGTGCAATGAAGTACGACATAACCCCACCA
TCCGGCCTTCGGTTCGACCTCCTCGGCCCGTTGACCGTGACCGCCGGCGAGCAACCCGTG 
GACCTGGGCGCGCCACGGCAGCGCGCCCTGCTCGCCCTGCTGCTCATCGATGTCGGCAAC 
GTGGTCCCGCTGCCGGTCATGACCGCGTCGATCTGGGGGGCCGACCCACCGTCCCGGGTC 
CGGGGGACGCTCCAGGCTTATGTGTCCCGACTGCGGAAACTCCTGCACCGCCATGACCGT 
TCCCTTCGCCTTGTCCACCAGCTCCAGGGGTATCTCCTCGAAGTGGATTCGGCGAAGGTG
GACGCCGTGGTTTTCGAGACACGTGTCAGGGAGTGCCGGGAATTGAGCAGGGCCCGGAAC 
CCCGAGGCCACCCGGGCCGTGGCCTGGTCCGCCCTGGAGATGTGGAAGGGCACACCCATG 
GGCGAGCTGCATGATTATGAATTTGTGGCGGCGGAGGCCGACCGGCTGGAAGGAATCCGG 
TTACGCGCGCTGGAGACCTGGTCCCAGGCGTGTCTCGATCTCCAGCACTATGAAGAGGTT 
GCATTTCAGCTCGGCGAGGAGATCCACCGCAATCCGGAACTGGAACGGCTGGGCGGTCTC
TTCATGCGGGCCCAGTATCATTCCGGACGGTCGGCGGAAGCCCTGTTGACGTATGAACGT 
ATGCGTACCGCGGTGGCGGAGAATCTGGGGGCCGATATCAGTCCGGAGCTCCAGGAACTC 
CATGGAAAGATTCTGCGCCAGGAACTCACGGAGACACCCGCCGCGCGATCGACGGCCTCC 
CTCACACGGGCGGCGGGCCCGCACGGGCCCCCGCCCCTGGCCGAAACCGGCACCCCCGCC

GCACCCGCGGACATGGCCGAAACCACGGTGGCGGAGGAAAGCGCCGCGCCCCCCGCCCCG 

GCGGCGCCCGGGACCCCGCCCCCCATGCCGTCCCCCGTACCGCTCCCCCATCCGTCAGGG 

GCCGTCCCGCCGGTCACCCCGGTGCCTCCCCCGGTCCCCCGCTCGGCCCTCCGTTCAGCG 

GCACCCGCCGAGACCGAGGACCCGGAACCGGCGCCGCCCCCTCCCCCTCCGCCGGGCGGC 

CGACTCATCGGCCGCCGCGCCGAACTGCGCAGGCTGCGGCTGCTGCTGACGAAGACCCGC 

GCGGGCCACGGCCATGTCCTGCTGGTCTGCGGCGAACAGGGCATCGGGAAGACCCGGCTC 

CTGGAGCACACCGAGCACACCCTGGCCGCGGGCGCGTTCCGGGTGGTCCGTTCGCACTGC 

GTCGCCACCCTCCCGGCACCGGGCTACTGGCCCTGGGAGCACCTCGTACGCCAGCTCGAC 

CCGGACAGCGGCCTCGGTGACGACGGCGACGCCGACCCCGTCGCCCAGGCCGAGTGGCTG 

CCGGAACACCACCTCACCCACCAGATGCGGATCTGCCGGACGGTGCTCGCCGCGGCGCGG 

CGGACCCCGCTCCTGTTGATCCTGGAGGATCTGCACCTCGCCCACGCGCCGGTCCTGGAT 

GTGCTCCAGCTCCTGGTCAAACAGATCGGCCAGGCCCCCGTCATGGTCGTCGCCACCCTG 

CGCGAGCACGATCTCGCCCGGGACCCCGCCGTCCGCCGGGCCGTGGGCCGCATCCTCCAG 

GCGGGCAACACCGGCACCCTCCGGCTGGACGGGCTCACCGAGGAGCAGAGCCGGGAGCTG 

ATCGTCTCGGTCGCGGGGGCCCCGTTCGCGCCCCATGACGCCCAACGGCTCCAGCGCGCC 

TCGGGCGGCAACCCGTTTCTGCTGCTCAGCATGGTCACAGGGGAGGACGGCACCCAGGAG 

TGGGCACGGCCGTGCGTCCCGTTCGAGGTGCGCGAGGTGCTGCACGAGCGGCTGAGCGAA 

TGCTCCCCGTCCACCCAGGACGTGCTCACGCTCTGCGCCGTGCTCGGCATGAGCGTGCGC 

CGACCGCTGCTCACCGACATCATGTCCACGCTCGACATCCCGCACACCGCGCTCGACGAC 

GCGCTCGGCACGGGGCTGCTGCGCCACGACCGGAACACCGACGGAATGGTCCACTTCGCC 

CATGGGCTGACCCGGGACTTCCTGCTCGACGACACCCCGCCGGTCACCCGCGCCCGCTGG 

CACCACCGGGTCGCCGCCACCCTCGCCCTGCGCTTCCAGCAGGGCGACGACCACGCCGAG 

ATCCGCCGCCACTGTCTGGCCGCGGCCCGTCTGCTCGGCGCCCGCGCGGGGGTGCGCCCC 

CTGCTGGCGCTGGCCGACCGGGAGCAGTCCCGCTTCTCCCACGCGGAGGCGCTGCGCTGG 

CTGGAGAGCGCGGTCGCGGTCGTCGCGGCGCTGCCCCGGGACCAGCCGGTGTCCGCCGTC 

GAACTCCAGTTGCGCAAACGGATGATGGCGCTGCACGCGCTGATGGACGGCTATGGATCG 

GCCCGCGTCGAGACGTTCCTCTCCCAGGTCACCCAGTGGGAACACGTCTTCGACAACACC 

CAGCCCACCGGGCTGCTGCACGTCCAGGCGCTGAGCGCGCTCACCACGGGCCGCCATGAG 

CAGGCGGCGGAGCTGGCCGGGCTGCTGCACGAGCTGGCCGACCACGGCGGCGGACCGGAG 

GCCCGGTCGGCGGCCTGCTATGTGGACGGCGTCACCCTGTATGTGGGCGGACGGGTCGAC 

GAAGCCCTCGCCGCGCTCGCCCAGGGCACCGAGATCACGGACGCCCTCCTGGCCGGACAC 

CGCAGGACCGCCGCCCCGCACGGCGGCGGGCACCTCCAGGACCGGCGTATCGACTTCCGC 

GCCTATCTGGCGCTCGGCCACTGTCTCAGCGGCGACCGGATTCAGACCCAGCGCTACCGG 

ACGGAACTCCTCCACCTCACCCAGTCGGAACGGTACGACCGGCCGTGGGACCGGGCCTTC 

GCCCGCTATGTGGACGCGCTCATCGCCGTCACGGAGTGCGATGTCCAGGGGGTGTGGCTG 

GCCGCGCGGGCGGGGCTCGACCTCGCCGCCCGCTGCCAGCTCCCGTTCTGGCAGCGGATG 

CTCGCCGTCCCCCTCGGCTGGGCCGAGGTCCACCAGGGGGCGCACGACAAGGGGCTGGCC 

CGGATGCGGGAGGCGCTGCACGAGGCGGCCCGGCACCGGACCCTGCTGCGCCGTACGCTC 

CACCTCGGCCTGCTCGCCGACGCCCTCCAGTACACGGGCGCCCGGGAACAGGCCCGGCGC 

ACGATGTCCTCCGCCGTACGGGAGATCGAGCGCCGCGGCGAGTACTTCTGTCTCCGGCCG 

CAGTGGCCCTGGGCCCGGCTCCTCCACAGCCACGGCACCTCCGCCGCGGCGGAGCACCGG 

GTCGTCCACGGCAGGCACTGACCCGGGGCCGGCCGGAGCCGGGCCCGTACGGTACGGGTC 

CGGCTCCGGACCCGGCGGCCCGGAGCCGGGCGGGGCGGGGCGGCCCGACGGTTCCGGGGC 

CGGCGGTTGTGGGAGGGGGCGGCCCCCGATCGCTCAGACCGGGCAGACGGCGGACCGCCG 

CCCCGCCCGGCCCGAGCCGCCGCCCCCGGCCCAGTGCCCGTAGTCGCCCCGCAGGAAGAC 

CAGGGGCGAACCCTCGCGGATCACCCCGAGGTCGCGCACCGCCCCGGTGACGAACCAGTG 

GTCGCCCGCCTCCGTCTCCCCCGCCACCTCGCAGTCGAACCACGCGAGCGCGTCGAGCAG 

GACGGGGGAGCCGGTGGCCGTCGTCCGGTACGGCACCTCCCAGCGCCCCGGATCGCCCCC 

GGCGAAACTCCGGCAGACCGGGCCCTGATCCGCGCCGAGCACATTGACGCAGAAACGCCC 

GGCCGCCCGGAGCCGCGGCCAGGTCGTCGACGACCTGGCCGGGAGGAAACCCACCAGCAC 

CGGATCGAGCGACACCGAGGTGAACGTCCCCACCACCATGGCGGGCGGCGGCTGCCCCGG

AGCCTCGGCCGGACCGGTGACCAGGACCACCCCGGTGGGATAGTGGCCCGCCACCCGGCG 

CAGCAGACTCCCGGACACGGACCCGTGGGTGTGCGCGGAAAGGCCCGGAGGCCGGGTCAC 

AGCCACGGGTAACGCGCGGTGTCCTTGCCCGCGTAATCGGGGTCCAGATAGACGAAGGCC 

CGGTGGACGAGGAAGTCCCGCACCTCGTAGACCGTGCACCAGCGCCCGGCGGCCCACTCG 

GGGTCACCCGCCCGCCACGGCCCGTCCCGGTGCTCACCGTGGGTGGTGCCCTCCGCGGCG 

AGGAGTTCGGTCCCGGTCAGAATCCAGTTGACGGACCACAGATGGTGGGTGATCGAGCGG 

ATGGTGCCCCCGAGGTCGTCGAAGAGCCGGGCGATCTCGGACTTGCCCCGGGCCAGACCC 

CACTTGGGGAAGAAGAAGACCGCGTCCTCGGCGAAGTAGTCGATCGCGGGGGTGCCGTCG 

CTGCCGACGCCGCCGTTGTCGAACGCCTTGAAGTACGCGGTGATGACCGCCTTGCGCTGC 

TCGTCCGTCATACCGGCCGATGCCACGGACATGAAACGACCTCCAGAGATTCCGGGTGGC

TGTGCTGGGGCTGCGGAAGGGGTGTCCCCCGCGAAGGACGGCGGACGCCGCGGACGCCGC 

GGCCGTCTCCCCGGCGGACGGGTCCCAGCGTCCTGGAGAGGGCTTGGCGGCGGCTTGACG 

CCGTGCTGTCCCGCGGCTTGCGGAACGCGAAGTACCGGCCAGCGTACGGGCGTTGCACCG 

GACGTGTACGCCGGTCGGGACCCCTCGTACCCCCGGAGCCGGCCGACCCCGGCGGCTCCG 

GGGGTACGGACGCGCCGGACCGGCCCGAGCGAGCCGGACGGGTCGGACGGTGCGCGTGGT 

TCCGGTGTGTCGGACAGCTCGGACGGACCGGACGGTGCGCGTGGTTCCGGTGTGTCGGAC 






WO 2004/092389 PCT/EP2004/004001 



AGCTCGGACGGGTCGGACGGTGCGCGTGGTTCCGGCACGCCGGACGGGTCAGTTGCCGAT 
CATGGCGAGCAATGCCGGGGTGTACCGCTCCCCGGACACCGGGTGGGAGATCGCGGCCGT 
CACCTCCGCGAGGGACCGGTCGTCCAGCCGGATCGAGGCGGCGGCGAGATTGTCCGCGAG 
ATGGGCCGGGTTCGCGGTGCCCGGGATCGGGACGACGTCCTCGCCCCGGTGGTGCAGCCA 
GGCGAGCGCGAGCTGTGCCAGGGTCAGCCCCAGACCGTCCGCGACCGGGCGCAGCCGGTG
CAGCAACGAGCGGTTGCGCGCGAGGGCCGGAGCGCTGAACCGGGGCTGGCCCCGGCGGAA 
GTCCTCGTCCCCCAGATCGTCGGTGGTGCGGATGGTGCCGGTGAGAAAACCCCGTCCCAG 
AGGGGCGTAAGCGACGATCCCGATCCCCAGCTCCCGGCAGACGGGCACCACCTCGTCCTC 
GATCCCGCGCGACCACAGGCTCCACTCGCTCTGCACCGCCGTCACCGGGTGCACCGCGTC 
CGCCCGGCGCAGCGTGGCCGCGGAGGGCTCGGAGAGACCGAGCCTGCGGACCTTGCCCTC
GCGCACCAGCTCGGCCACCGCACCCACGGTCTCCTCGATCGGCACCGCCGGGTCCGTCCA 
GTGCTGGTAGTACAGGTCGATGCGGTCGGTGCCGAGACGACGCAGGGACCGTTCGCAGGC 
CGCGCGGACGTAGGACGGCTCGCCGCACAAGCCCTGGGAGGCGCCGTCGGACGAGCGCAC 
CATGCCGAACTTGGTGGCGATCAGCACCTCGTCCCGGCGGCCCGCGACCGCCCGTCCGAG 
CAGCTCCTCACCGGCGCCGAGCCCCTGGACGTCGGCGGTGTCCAGCAGGGTGACCCCGGC
GTCGACGGCGGCGCGGATGGTGGCCGTCGCCCGGGCGCGGTCCGGGCGTCCGTAGAAGTC 
GGTGGTCGGCAGGCAGCCGAGCCCCTGGGCACTGACCGGAAGGTCCCGCAGGGCGCGGAC 
CGGCGGACGCGGAACCGCGGCGGACACGGAACCGGCCGGGGACTCGGGCGGAGAGCGGGA 
CATACGGAACCTCCACAGGCGGAGCCGGGAACGGGACGAGGGCGAGGACGGGACGGAACG 
AAGGAGAGGACGGGACGGACAGCACGGACGGGACGGACGGAACGGAGTCGGGAACCGGGG
GGGGTGACCGGAACCGGGCCGTCCTTGGCCCTCCCCCGTCCTCCCCGCCATCCGCCGTTC 
TCCCCCGTTCCCTCTCCCGTCCTCCAGCCAACACCGCCGCCCTTTCCAAGCGCTTGACAC 
GGCACCGACAGCCGCCGCCGGGCGCCCGATGGGGACCCGTGCCCGCCGGTGAGCGGCGGT 
GAGCGCCGGTACGGGACCCCACGCGCCGCCGCCCGGGCGCCCGCCAGGGCCCGCGCGGCC 
ACCCCGGCCCGCCCCGGCCGGAGCGGCGATCCGGGCCGCTCGCTGCAAGAGGAACATCCA
CAGCCGCACAAGGAGCGCTCCGCACAGTGGGCACCACGTCCGCCCCGTCCCCCACACCGT 
GGCCGGTCCCCACCGGACAGCACAGCACCGCACAGCACCACATCGCACGGCACAGCACAG 
CACCACCGGCACGAGGAACCAAGGAAAGGAACCACACCACCATGACCTCAGTGGACTGCA 
CCGCGTACGGCCCCGAGCTGCGCGCGCTCGCCGCCCGGCTGCCCCGGACCCCCCGGGCCG 
ACCTGTACGCCTTCCTGGACGCCGCGCACACAGCCGCCGCCTCGCTCCCCGGCGCCCTCG
CCACCGCGCTGGACACCTTCAACGCCGAGGGCAGCGAGGACGGCCATCTGCTGCTGCGCG 
GCCTCCCGGTGGAGGCCGACGCCGACCTCCCCACCACCCCGAGCAGCACCCCGGCGCCCG 
AGGACCGCTCCCTGCTGACCATGGAGGCCATGCTCGGACTGGTGGGCCGCCGGCTCGGTC 
TGCACACGGGGTACCGGGAGCTGCGCTCGGGCACGGTCTACCACGACGTGTACCCGTCGC 
CCGGCGCGCACCACCTGTCCTCGGAGACCTCCGAGACGCTGCTGGAGTTCCACACGGAGA
TGGCCTACCACCGGCTCCAGCCGAACTACGTCATGCTGGCCTGCTCCCGGGCCGACCACG 
AGCGCACGGCGGCCACACTCGTCGCCTCGGTCCGCAAGGCGCTGCCCCTGCTGGACGAGA 
GGACCCGGGCCCGGCTCCTCGACCGGAGGATGCCCTGCTGCGTGGATGTGGCCTTCCGCG 
GCGGGGTGGACGACCCGGGCGCCATCGCCCAGGTCAAACCGCTCTACGGGGACGCGGACG 
ATCCCTTCCTCGGGTACGACCGCGAGCTGCTGGCGCCGGAGGACCCCGCGGACAAGGAGG
CCGTCGCCGCCCTGTCCAAGGCGCTCGACGAGGTCACGGAGGCGGTGTATCTGGAGCCCG 
GCGATCTGCTGATCGTCGACAACTTCCGCACCACGCACGCGCGGACGCCGTTCTCGCCCC 
GCTGGGACGGGAAGGACCGCTGGCTGCACCGCGTCTACATCCGCACCGACCGCAATGGAC 
AGCTCTCCGGCGGCGAGCGCGCGGGCGACGTCGTCGCCTTCACACCGCGCGGCTGAGCTC 
CCGGGTCCGACACCGCGCGGCTGAACCCACGGTCCGGGGCCCACGGTCCGGCACCGCGCG
GCTGAGCCCCCGGGTCCGGCAGCGGGCGGCTGAACCCCCGCCCCGGGCCACCGCCCGACC 
GCCCCCGCGCACCGGACGCGCCCGCCTGTACGGCGGTCCCGCCCGGGCCCGTACACCTGA 
AGCGCCCGGCGGACCGCCGCCCCGCCGGGGGACGGACAGAGCCGGGTGCGGGAGGACGTC 
CTCCCGCACCCGGCTCCCACCGTTCCGCACCGACCGCACCCGACCGTGCCGCAGGCGCCA 
CCGGCACCGCACCGCCCGCGCCGGCAGCCACCACAGGCGCCACGCCGCCCGCACGGTGCC
CGCGCTGCTCAGCCCCCGTCCACCGGGCTGTCCAGCAGCCGCCGCAGCGCGCCCCCGATG 
AACTCCCGGTCGGCGGCCGACCCCCCGGACCCCGCGAGATGCCCCCACACTCCCGGGATC 
ACCTCCAGCGAGGCATACGGCAGCAGATCGGCCACCCGCTTCTCGTCCTCGACGGCGAAA 
CACACGTCCAGGGCGCCCGGCAGCACCACGGCCCGCGCCGTGACGGAGGCCAGCGCCGCC 
TCGACGCTCCCCCCGGCCCCGGGTGTCGCCCCCACATCCGTGTTCTCCCAGGTGCGCACC
ATGGTGAGCAGATCCGCGGCGCCGGGCCCGGAGAGGAAGACCTGCTCCCAGAAGCCGGTG 
AGGTACTCCTCGCGGGTGGCGAAACCCAGCTCCCGGTGGGCACGGCGGGCCCAGAAGGAA 
CGCGAGGTCCCCCACCCGGCGAACACCCGGCCCGCCGCCTTCCGCCCCCGCTCCCCGGCG 
TCGGCGCTGAGCGCCGCGGCCAGACCGGACAGCAGGACCAGGCTGTGCGGGCTGCTCACC 
GGCGCCCCGCAGATCGGGGCGATCCGGCGCACCATCCCCGGATGCGACACGGCCCACTGG
TAGGCGTGGGCCGCGCCCATCGACCAGCCCGTGACCAGGGCCAGTTCCCGTACCCCCAGC 
TCCTCGGTGAGCAGCCGGTGCTGCGCCGCGACATTGTCCTGCGGAGTGATCAGCGGAAAG 
CGGGACCCCGACGGGTGGTTGCCGGGCGAGCTGGAGACCCCGTTGCCGAAGAGTCCGGCG 
GTGACGACGCAGTACCGCCGGGTGTCCAGCGGCAGCCCCGCACCGATCAGCCAGTCGTAC 
CCGGTGTGGTCCCGGCCGAAGAACGACGGACAGAGCACCACGTTCGTCCCGTCGGCGTTC
GGCGTGCCGTACATGGCGTAACCGATCCGGGCGTCCCGCAGGACCTCCCCGTCCAGCAAC 
GGCAGTTCGTCGATCTCGAATATGCGGCATTCCACCGCTGACCTCCTTGTTCGATCCCCC 
CGGACAACAGGTCGGTCGTGGCCGGAGACTCAGAGCCAGTTGGGGGCGATCTCGGTGGCC
CACAGCTCCAGGCTGCGCAGCTGGACATCGTGCGGGATCAGCCCGGAGTACTGGCACTGG
AGCAGATACTCCGGATCGTGCCGCTCCACCAGCTTCTCGATCATGCGGTTGATGTCGTCC 
GGGGTGCCGACCCACTCCAGCCCCCGGTCGACCAGGGTCTTGTAGTCCGAGCCGATCGGA 
CCCGTCTCGCCGGTCGCGCGCAGCGCCTCGGTGAAGCCCATGGGGCCGAACCAGTTCTCG 
AAGATGAAGCCGCCGCCGCGGGACGCCCAGTGGTGGGCCTCGCCGGAGTCCCGGGAGACC
AGGACGTCCTTCATCACCCCGACCCGCTCGCCCCGCCGCAGGGTGCCGTGGCCCGCCGCC 
TCGGCCTCCTCCCGGTAGATGTCCATCAGCCGGGCGACGATCTGGTCGTCGGTGTTCATC 
AGGATCGGCACCACGCCCTCCCGGGCACAGAACCGGAACGTGTCCTCACTGAAGCTGAAC 
GGCTGGAAGACGGGCGGGTGGGGGCGCTGGTAGGGCTTGGGCGCGATGCCCACCTCGCGG 
ATGACGCCGTTCTCGTCGAGGCCCCGGCCGTAGCGGCGCACCGCCTCGTAGGGGAACTCC
AGGTCCGGCACCGGGATCGTCCACTGCTCCCCGGAGTGGGTGAACGTCTCGGTCGTCCAC 
GCCTTCTTGATGATCTCCCAGTGCTCCTCGAAGAGGGCACGATTGCGCCGGTCCCGCTCC 
CCGGCGTCGGACAGGGTGCCGCCGACCCCGTACACCTGCCCCATGATGTCGGCCCAGCGC 
TTCTGGAACCCGCGCGCGATCCCGACGAAGGCGCGGCCCCGGGTCATGTGGTCGAGCATC 
GCCAGATCCTCGGCCAGCCGCAGCGGATTGTGCAGCGGCAGGACGTTGGCCATCTGGCCG
ACCCGGATGTGCCGGGTCTGCATGCCGAGGTAGAGCCCCAGCATGATCGGGTTGTTGGAG 
ACCTCGAAACCCTCGGTGTGGAAGTGGTGCTCGGTGAAGGACAGTCCCCAGTAGCCGAGT 
TCGTCGGCCGCCTGCGCCTGCCGGGTGAGCTGCCGGAGCATGTTCTGGTAGTTCTGCGGA 
TTGACCCCCGCCATACCCCGCTGGACCTGCGCATGACTGCCGACCGTTGGCAGATAGAAG 
AGAATGGACTTCACCCTGGCTCCTCCGGTTCGCGGCGCCCTCCATTGACGTGCGCCGAAA
GCGGCTCGACCGTCCCACTCCGCCCTTGAGTTCCGTCTGACGCCGCGCCAGTCGGCGGGC 
CGTCCGCCGGGGTGCCCGCCGGGGTCCGCACCCGCCGGACGGCACGGCGCGCACCGCGCG 
CGCGGCGCTTCGGGGCACCGGGCTCGACGGGGTGCTCAGCGGGACGTCCAACGGAAGGCA 
AGCCCCCGTACCCAGCCTGGTCAAGGCGCTCATCGCCATTCCCTGAGGAGGTCCCGCCTT 
GACCACAGCAATCTCCGCGCTCCCGACCGTGCCCGGCTCCGGACTCGAAGCACTGGACCG
TGCCACCCTCATCCACCCCACCCTCTCCGGAAACACCGCGGAACGGATCGTGCTGACCTC 
GGGGTCCGGCAGCCGGGTCCGCGACACCGACGGCCGGGAGTACCTGGACGCGAGCGCCGT 
CCTCGGGGTGACCCAGGTGGGCCACGGCCGGGCCGAGCTGGCCCGGGTCGCGGCCGAGCA 
GATGGCCCGGCTGGAGTACTTCCACACCTGGGGGACGATCAGCAACGACCGGGCGGTGGA 
GCTGGCGGCACGGCTGGTGGGGCTGAGCCCGGAGCCGCTGACCCGCGTCTACTTCACCAG
CGGCGGGGCCGAGGGCAACGAGATCGCCCTGCGGATGGCCCGGCTCTACCACCACCGGCG 
CGGGGAGTCCGCCCGTACCTGGATACTCTCCCGCCGGTCGGCCTACCACGGCGTCGGATA 
CGGCAGCGGCGGCGTCACCGGCTTCCCCGCCTACCACCAGGGCTTCGGCCCCTCCCTCCC 
GGACGTCGACTTCCTGACCCCGCCGCAGCCCTACCGCCGGGAGCTGTTCGCCGGTTCCGA 
CGTCACCGACTTCTGCCTCGCCGAACTGCGCGAGACCATCGACCGGATCGGCCCGGAGCG
GATCGCGGCGATGATCGGCGAGCCGATCATGGGCGCGGTCGGCGCCGCGGCCCCGCCCGC
CGACTACTGGCCCCGGGTCGCCGAGCTGCTGCACTCCTACGGCATCCTGCTGATCTCCGA 
CGAGGTGATCACGGGGTACGGGCGCACCGGGCACTGGTTCGCCGCCGACCACTTCGGCGT 
GGTCCCGGACATCATGGTCACCGCCAAGGGCATCACCTCGGGGTATGTGCCGCACGGCGC 
CGTCCTGACCACCGAGGCCGTCGCCGACGAGGTCGTCGGCGACCAGGGCTTCCCGGCGGG
CTTCACCTACAGCGGCCATGCCACGGCCTGCGCGGTGGCCCTGGCCAACCTGGACATCAT 
CGAGCGCGAGAATCTGCTCGACAACGCCAGCACCGTCGGCGCCTACCTGGGCAAACGCCT 
GGCCGAGCTGAGCGATCTGCCGATCGTCGGGGACGTCCGGCAGACCGGTCTGATGCTCGG 
TGTCGAACTGGTCGCCGACCGCGGAACCCGGGAGCCGCTGCCGGGCGCCGCCGTCGCCGA 
GGCCCTGCGCGAGCGGGCGGGCATCCTGCTGCGCGCCAACGGCAACGCCCTCATCGTCAA
CCCCCCGCTGATCTTCACCCAGGAAGACGCCGACGAACTCGTGGCGGGCCTGCGCTCCGT 
ACTCGCCCGCACCAGGCCGGACGGCCGGGTGCTCTGACCCCTTTGGCCCTCCCCGGCCCC 
ACCGGGGCACCACCCCGCCGCACCCCGAGCGCAAAAAGACCCCTCTGCCTGCGTTTCCGC 
AGGTCAGAGGGGTCTGGTGCAGTGGAGCCTAGGGGAGTCGAACCCCTGACATCTGCCATG 
CAAAGACAGCGCTCTACCAACTGAGCTAAGGCCCCGAAGCGACAGAACGGCCCTGGACTG
CTCCGTCCCGGCCACTGCCGCAGACCAGAGTACCGGGTGTTCCCGGTGATCCTCCAAAAC 
ATTGAGGTCTCCCGGTGGGCGACCACTCTCCGTj^AGATGCTCGACGTGGTTCGCAGCAGC 
GAAGCCCGCTTGGGGAAGCGATGGGGAGACGCGCATGGACGCCGCTCAGCAGGAGACGAC 
CGCAAGAGCCCGGGAGCTACAGCGAAGCTGGTACGGGGAGCCCCTGGGGGCCCTGTTCCG 
CAGGCTGATAGACGATCTGGGGCTGAACCAGGCGCGTCTCGCGGCGGTGCTGGGCCTCTC
CGCCCCCATGCTCTCCCAGCTCATGAGCGGCCAGCGGGCCAAGATCGGCAACCCGGCCGT 
GGTCCAACGGGTCCAGGCGCTCCAGGAGTTGGCCGGACAGGTGGCCGACGGCAGCGTCAG 
CGCGGTGGAGGCCACCGACCGCATGGAGGAGATCAAGAAGTCGCAGGGAGGCTCCGTCCT 
GACCGCGAACAGCCAGACCACCAACAGCTCGGGGGCGCCGACCGTCCGCCGGGTCGTCCG 
GGAGATCCAGTCGCTGCTGCGGTCCGTGTCCGCCGCGGGGGACATCATCGACGCGGCGAA
CTCCCTCGCCCCGACCCATCCGGAGCTGGCAGAGTTCCTGCGGGTGTACGGGGCCGGGCG 
CACCGCGGACGCCGTGGCGCACTACGAGTCCCACCAGAGCTGACGACCGAGGCCGGCCCC 
GGAACGGACCAGAGCCTCATGAGGGACGGGGAGCGGACGCGGCACCATGGGTGAGGTCTT 
CGCCGGCCGGTACGAGCTGGTCGACCCGATCGGACGCGGAGGGGTCGGCGCGGTCTGGCG 
CGCCTGGGACCACCGGCGCCGCCGCTATGTGGCGGCCAAGGTGCTCCAGCAGAGCGACGC
GCACACCCTGCTGCGCTTCGTCCGCGAGCAGGCCCTGCGGATCGACCATCCCCATGTCCT 
GGCCCCGGCGAGCTGGGCCGCGGACGACGACAAAGTCCTCTTCACCATGGATCTCGTGGG 
CGGCGGATCACTCGCGCACGTGATCGGCGACTACGGCCCGCTCCCGCCGCGCTATGTGTG 
CGCCCTGCTGGACCAACTCCTCTCCGGGCTCGCCGCGGTGCACGCCGAGGGCGTGGTGCA
CCGCGACATCAAACCGGCGAACATCCTGATGGAGGCCACCGGGACGGGCCGCCCCCATCT
GCGCCTGTCCGACTTCGGCATCTCCATGCGCAAGGGCGAGCCCCGGCTGACCGAGACCAA 
CTATGTCGTGGGTACGCCCGGTTACTTCGCCCCCGAGCAGGTCGAGGGCGCGGAGCCGGA 
CTTCCCCGCCGATCTCTTCGCCGTCGGCCTGGTCGCCCTCTATCTGCTGGAGGGTCAGAA
ACCCGACACCAAGGCCCTGGTGGACTTCTTCACCGCCCATGGCACCCCCGGTGCTCCCCG 
GGGGATACCGGAGCCGCTGTGGCAGGTGCTCGCGGGGCTGATCCAGCCCGACCCCGCCGC 
CCGGTTCCGTACGGCGACGGGGGCCCGGAAGGCCCTCGCCGCCGCCGTGGAACTGCTTCC 
CGAGAGCGGCCCCGACGACGAACCGGTGGAGATATTCGACCAACTGGGCCCGCTGCCGCC 
GGGGTTCGGCCCCGGCGGCCCCGAGAACACGCCGCCCTCCGGTCTGCTGCGCTCGGCGGC
CTCCGGTACC 

SEQ ID NO:18 orf2par reverse complement

ATGGCCACCACGACCGCGAAAGCCATGCTGGAACGTCTTCACCAGTACGGTGTCGACCATGTATTCGGCGTCGTCG 

GCCGGGAGGCGTCCGCCATTCTCTTCGACGAGGTCGAAGGACTCGACTTCGTCCTGACCCGGCACGAGTTCACCGC
CGGGGTGATGGCGGACGTCCTCGCCCGGATCACCAACCGCCCCCAGGCGTGCTTCGCGACCCTGGGCCCCGGCATG 
ACCAACCTGGCCACCGGCGTCGCCACCTCCGCCCTGGACCGCAGCTCGGTCATCGCGCTGGCCGCGCAGTCCGAGT 
CGTACGACTGCTACCCCAACGTCACCCACCAGTGCCTGGACAGCACCGCCGTGATGGGCCCGCTGACCAAGTTCAG 
CGTCCAGCTCGAACGCGGCGAGGACATCGTCAACCTCGTCGACAGCGCCGTCCTCAACAGCCGGATCGAGCCCGTG 

GGTCCCAGCTTCATCAGCCTGCCGGTCGACCTCCTCGGCGCCGAGCTGAACGGCACCCCCACCGACGCCCCCCTGG
TCCGGGCCACCGCCACCCACGCCCTGGACGCCGACTGGCGCGCCCGCCTCGACGAGGCCGCTGAGCTGGTGCGCGA 
GGCCGAGAACCCCCTCCTCGTCGTCGGTAGCGCCGTCATCCGCGCCGGGGCCGTCGACGCCCTGCGCGCCCTCGCC 
GAGCGGCTGAACATCCCCGTCGTCACCACCTACACCGCCAAGGGCGTCCTGCCGCACGACCACCCGCTCAACTACG 
GCGCCATCAGCGGCTACATGGACGGCATTCTCGGCCACCCGGCCCTCGACGAGATCTTCGGCCCCGCCGACCTCCT 

CCTGGCGATCGGCTACGACTACGCCGAGGACCTGCGCCCCTCCATGTGGACGCGGGGCCGGGCCAAGACCACGGTC
CGGGTCGCCCCCGAGGTCAACCCGATCCCGGAGCTGTTCCGCGCCGACATCGACATCGTCACCAACGTCGCCGAAT 
TCGTCACCGCGCTCGACGACGCGACCTCGGGCCTCGCCCCCAAGACCCGGCACGACCTCAGCGCCCTGCGCGCCCG 
CGTCGCCGAATTCCTCGCCGACCCCACCGAGTACGAGGACGGCATGCGGGTCCACCAGGTGATCGACTGCATGAAC 
TCCGTCCTCGACAACGGCACCTTCGTCAGCGACATCGGCTTCTTCCGCCACTACGGCGTGCTCTTCGCCAAGTCCG 

ACCAGCCGTACGGATTCCTCACCTCCGCGGGCTGCTCCAGCTTCGGCTACGGACTGCCCGCCGCCATGGCCGCCCA
GATCGCCCGGCCCGGCGAGCCCGTCTTCCTCATCGCGGGCGACGGCGGCTTCCACTCCAACAGCGCCGACATCGAG 
ACGGCCGTGCGCCTGGGCCTGCCGATCGTCATGGTCGTCGTCAACAACGACCGCAACGGCCTGATCGAGCTGTACC 
AGAACCTCGGACACCAGCGCTCCCACGCCCCCGCCGTCGGCTTCGGAAGCGTCGACTTCGTCCAGCTCGCCGAGGC 
CAACGGCTGCGAGGCCGTCCGCGCCACCGACCGCACCTCGCTGCTCGCCGCCCTCACCAAGGGCGCCGGACTCGGC 

CGCCCGTTCCTGATCGAGGTACCGGTGGCCTACGACTTCCAGTCCGGCGGTTTCGCCGCCCTGGCCATCTGA

SEQ ID NO:19 orf3par reverse complement

ATGCCCGGCCCCGACCTCGTGTACGGATTCCGGGTGCGCATCGGCACCGAGGGCCGCCCCGGCGGCGGCCCCGGCG 
GTCACTCCGAACCCGGCAGCGCACCCCGCTTCGCCGTCCGCGGGACCCATGTCCCCGTGCACGACGGCACCGCGTA 

CCCGCTCTGGAGCGGAACGGCCGTGACCCTGGGCCGTCCGCCCGTCCTGGTCGCCGACGGCCAGGTCCGGCTGCTC
CTGGCGGGCGAGCTGTACAACCGCGCCGAGCTGACCGGAGCGCTCGGCGGCTCCTCTGCCGCCCTCGGCGACGCCG 
AACTGCTGCTGGCCGCCTGGCGGCGCTGGGGCCCCGGGGCCTTCCGGCTCCTGAACGGACGGTTCGCCGCACTGCT 
CACCGACGCCTCCACCGGCGCGACCGTCGCGGCCACCGACCACGCCGGTTCGGTACCGCTGTGGCTGCGCGCCGAC 
GTGACGGGGCTGAGCGCCGCCACCGAGGCGAAGACCCTGGCGCACGAGCCGGGCCGGCCGCTGGGCCTGTCCGGCA 

CCCACACCCGCCGGGGGCGGCGGGCGTCTGCCGGGTCCCCGCCGGGACCGCCCTCCTGCTGCACGGAGTCGGCGGC
TCCGACATCACCGCCAGGGCGGTCCGCACCTGGACACCCCCGCTCTCCCGGGCGCTGCCCGGCGAACGGGAGGCGG 
TGGACCTGGTCGGCGAACGCCTCGCCACGGCGGTCCGCACCCGGCTGCGCGGCGGGGAGGCGGCCCCCACCGTCGT 
CCTGTCCGGCGGCATCGACTCCGGGGGAGTCGCCGCCCACACGGCGGCCCTGGCACCCGGGACACGGTCCGTGTCG 
ATGGGCACCGAGGTGTCCGACGAGTTCGACGCGGCCCGCTCGGTCGCCGTCCACCTGGGCACCGCGCACAGCGAGA 

TCCGGCTCCACTCGGCCGAACTCGTCAGGGAACTGCCCTGGGCGGTCGCCGCCGCGGAGATCACCGACCCCACGGT
CCTGGAGTACCTGCTGCCGCTCGTCGCCCTCTACCGGCGGCTCGACACCGGGCCGCTCCGCATCCTCACCGGGTAC 
GGCGCCGACATCCCGCTCGGCGGTATGCACCGGCGCACGGCCTCGCTCTGGTCCCTCGACGACGAGATCGCGGGCG 
ACATGGCGGGCTTCGACGGCCTCAACGAGATGTCCCCCGTCCTCGCGGGCATCGCCGGGAAGTGGACCACCCACCC 
GTACTGGGACCGCGCGGTCCTGGACGCGCTGGTCTCCCTCGAACCCGGGCTCAAACGCCGGCGGGGCACCGACAAG 

TGGGTGTTGCGGCAGGCCCTCTCCGGCCTGCTGCCCGCCGAGACCGTGGCCCGCCCCAAGCTGGGCATCCACGAGG
GGTCCGGCACCACCAGCGCGTGGACCGGACTGCTCCTCGCCGAAGGGATCCGGCGCGACGAGGTGACGGCCGTCAA 
GGGCGCCATGGCACGGCGCCTGTACGACGCGGTGGTCATCGACACGGTGCCGCCGGAGGACGTGGACTTCGGCGAG 
ACGGTGCGGCGCTCCGTCGACGCGGTGCGCAGGCTCAGGCTCCAGGGCCGGGTGGTCGTATGA 

SEQ ID NO:20 orf4par reverse complement

GTGTCCACCGCCGTCTCCCCGCGCTACGCCCAACCGGCGACCTTCATGCGGCTGCGCCACCGGCCCGACCCGATCG 
GCCATGACGTGGTGGTCGTCGGCGCCCCGTACGACGGAGGCACCAGCTACCGGCCCGGCGCGCGGTTCGCGCCGCG 
CGCCATCCGGCACGAGTCCAGCCTGATCCACGGCGTCGGCATCGACCGGGGCCCAGGGGTCTTCGACCGGATCGAC 
GTGGTCGACGGGGGCGACATCGACCTCAGCCCCTTCTCGATGGACCTGGCGATGGACACCGCGACGGTCGCCCTGA 
CCCGGCTCCTGGAACGCAACGACGCGTTCCTGATGCTGGGCGGGGACCACTCGCTCTCCCTGGCCGCCCTGCGCGC
CGTGCACGCCCGCCACGGCCGGGTCGCCGTCCTGCACCTGGACGCGCACAGCGACACCAACCCACCCGTCTACGGC 
GGCACCTACCACCACGGCACCCCCTTCCGCTGGGCCATCGAAGAGGGCCTGGTGGACCCGGAGCGCCTGGTCCAGG 
TCGGCATCCGCGGCCACAATCCGCGGCCCGACTCCCTGGACTACGCGCGCGGGCACGGCGTCAGCATCGTCACCGC
CGCCGACTTCACCCGGCGCTCACCGCGCGGCATCGCCGAGCAGATCCGGCGCACCGTCGGCGGCCTGCCGCTGTAC
GTCTCCGTCGACATCGACGTCGTCGACCCGGCGTACGCCCCGGGCACCGGCACACCGGCCCCCGGCGGGCTGTCCT 
CGCGCGAGGTGCTGACCCTGCTCGACGTGGTCGGG<^GCT(^GGCCCGTCGGCTTCGACGTGGTCGAGGTGTCCCC 
GGCGTACGACCCGTCGGGGATCACCTCCCTGCTGGCGGCGGAGATCGGGGCCGAACTGCTCTACCAGTACGCCCGC
GCCACCACGTCGCCCGCGTCGGCACCGGTGGACTCTCCCCTGCCACCGGGGGCGGCGGCGGACGACGCCGAGAACG 
CCGAGAACGCGGTGGACGCGGTGGACGCCGAGAGCGCCGTGGACTTCGCCGGGCAGCGGTGGGGGTAG 

SEQ ID NO:21 cvm6 Polypeptide 

VPGSGLEALDRATLIHPTLSGNTAERIVLTSGSGSRVRDTDGREYLDASAVLGVTQVGHGRAELARVAAEQMARLEY

FHTWGTISNDRAVELAARLVGLSPEPLTRVYFTSGGAEGNEIALRMARLYHHRRGESARTWILSRRSAYHGVGYGSG
GVTGFPAYHQGFGPSLPDVDFLTPPQPYRRELFAGSDVTDFCLAELRETIDRIGPERIAAMIGEPIMGAVGAAAPPA
DYWPRVAELLHSYGILLISDEVITGYGRTGHWFAADHFGVVPDIMVTAKGITSGYVPHGAVLTTEAVADEVVGDQGF
PAGFTYSGHATACAVALANLDIIERENLLDNASTVGAYLGKRLAELSDLPIVGDVRQTGLMLGVELVADRGTREPLPG

AAVAEALRERAGILLRANGNALIVNPPLIFTQEDADELVAGLRSVLARTRPDGRVL

SEQ ID NO:22 cvm3 Polypeptide 

VTRPPGLSAHTHGSVSGSLLRRVAGHYPTGWLVTGPAEAPGQPPPAMVVGTFTSVSLDPVLVGFLPARSSTTWPR 
LRAAGRFCVNVLGADQGPVCRSFAGGDPGRWEVPYRTTATGSPVLLDALAWFDCEVAGETEAGDHWFVTGAVRDL 
VIREGSPLVFLRGDYGHWAGGGGSGRAGRRSAVCPV

SEQ ID NO:23 orf6par Polypeptide 

MRASSPRGFRVHHGHAGIRGSHADLAVIASDVPAAVGAVFTRSRFAAPSVLLSRDAVADGIARGVVVLSGNANAGT 
GPRGYEDAAEVRHLVAGIVDCDERDVLIASTGPVGERYPMSRVRAHLRAVRGPLPGADFDGAAAAVLGTAGARPTI 
RRARCGDATLIGVAKGPGTGPAEQDDRSTLAFFCTDAQVSPWLDDIFRRVADRAFHGLGFGADASTGDTAAVLAN
GIAGRVDLVAFEQVLGALALDLVRDVVRDSGCGGALVTVROT 

VAAVAGGHGDEGPGRSPGRITIRVGGREVFPAPRDRARPDAVTAYPHGGEVTVHIDLGVPGRAPGAFTVHGCDLLA 
GYPRLGAGRAV. 

SEQ ID NO:24 orf4par Polypeptide

VSTAVSPRYAQPATFMRLRHRPDPIGHDVVVVGAPYDGGTSYRPGARFAPRAIRHESSLIHGVGIDRGPGVFDRID
VVDGGDIDLSPFSMDLAMDTATVALTRLLERNDAFLMLGGDHSLSLAALRAVHARHGRVAVLHLDAHSDTNPPVYG
GTYHHGTPFRWAIEEGLVDPERLVQVGIRGHNPRPDSLDYARGHGVSIVTAADFTRRSPRGIAEQIRRTVGGLPLY
VSVDIDVVDPAYAPGTGTPAPGGLSSREVLTLLDVVGQLRPVGFDVVEVSPAYDPSGITSLLAAEIGAELLYQYAR
ATTSPASAPVDSPLPPGAAADDAENAENAVDAVDAESAVDFAGQRWG.

SEQ ID NO:25 orf3par Polypeptide 

MPGPDLVYGFRVRIGTEGRPGGGPGGHSEPGSAPRFAVRGTHVPVHDGTAYPLWSGTAVTLGRPPVLVADGQVRLL 
LAGELYNRAELTGALGGSSAALGDAELLLAAWRRWGPGAFRLLNGRFAALLTDASTGATVAATDHAGSVPLWLRAD
VTGLSAATEAKTLAHEPGRPLGLSGTHTAPGAAGVCRVPAGTALLLHGVGGSDITARAVRTWTPPLSRALPGEREA
VDLVGERLATAVRTRLRGGEAAPTVVLSGGIDSGGVAAHTAALAPGTRSVSMGTEVSDEFDAARSVAVHLGTAHSE
IRLHSAELVRELPWAVAAAEITDPTVLEYLLPLVALYRRLDTGPLRILTGYGADIPLGGMHRRTASLWSLDDEIAG
DMAGFDGLNEMSPVLAGIAGKWTTHPYWDRAVLDALVSLEPGLKRRRGTDKWVLRQALSGLLPAETVARPKLGIHE
GSGTTSAWTGLLLAEGIRRDEVTAVKGAMARRLYDAVVIDTVPPEDVDFGETVRRSVDAVRRLRLQGRVVV.


SEQ ID NO:26 orf2par Polypeptide 

MATTTAKAMLERLHQYGVDHVFGVVGREASAILFDEVEGLDFVLTRHEFTAGVMADVLARITNRPQACFATLGPGM

TNLATGVATSALDRSSVIALAAQSESYDCYPNVTHQCLDSTAVMGPLTKFSVQLERGEDIVNLVDSAVLNSRIEPV
GPSFISLPVDLLGAELNGTPTDAPLVRATATHALDADWRARLDEAAELVREAENPLLVVGSAVIRAGAVDALRALA
ERLNIPVVTTYTAKGVLPHDHPLNYGAISGYMDGILGHPALDEIFGPADLLLAIGYDYAEDLRPSMWTRGRAKTTV
RVAPEVNPIPELFRADIDIVTNVAEFVTALDDATSGLAPKTRHDLSALRARVAEFLADPTEYEDGMRVHQVIDCMN
SVLDNGTFVSDIGFFRHYGVLFAKSDQPYGFLTSAGCSSFGYGLPAAMAAQIARPGEPVFLIAGDGGFHSNSADIE
TAVRLGLPIVMVVVNNDRNGLIELYQNLGHQRSHAPAVGFGSVDFVQLAEANGCEAVRATDRTSLLAALTKGAGLG
RPFLIEVPVAYDFQSGGFAALAI